Re: [NH] Notetab refuses to perform edits on this .html file

  • stitch.happy
    Message 1 of 9 , Oct 10, 2012
      Thanks, John and Marcelo, for your suggestions. It turned out to be a Unicode character, a checkmark, that was the problem. I appreciate the help!

      I used NotePad to open the file and did a Save-As and selected ANSI format. Got a warning that said I was about to lose Unicode formatted characters. Saved as a new file and did a file compare (CompareIt!) between the two files.

      Regards,
      Bev

      --- In ntb-html@yahoogroups.com, Marcelo Bastos <bytext@...> wrote:

      > I didn't check, but most times when I couldn't edit a file, it turned out
      > to be a Unicode file. Notetab has limited Unicode support.
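
Bev found the offending character by saving a lossy ANSI copy and comparing files. As an alternative outside NoteTab, a short Python sketch can list every non-ASCII character with its position. This is only an illustration: the function name `find_non_ascii` and the sample string are made up here, and the text is assumed to be already decoded (for example, read from a UTF-8 file).

```python
import unicodedata


def find_non_ascii(text):
    """Return (line, column, char, code point, Unicode name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((line_no, col, ch, f"U+{ord(ch):04X}", name))
    return hits


# Hypothetical sample: an HTML fragment containing a check mark.
sample = "<li>Done \u2713</li>\n<p>plain</p>"
for hit in find_non_ascii(sample):
    print(hit)
```

For a real file, something like `find_non_ascii(open(path, encoding="utf-8").read())` would report where each such character sits, without overwriting anything.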
    • stitch.happy
      Message 2 of 9 , Oct 10, 2012
        Thanks, Axel. I sent the prev reply before seeing this. This looks very handy.

        -Bev

        --- In ntb-html@yahoogroups.com, Axel Berger <Axel-Berger@...> wrote:
        >
        > Marcelo Bastos wrote:
        > > The problem: if there were Unicode characters there, you lost them.
        >
        > Which is why that's not the way to do it. Hope the following is correct
        > (i.e. works first time), I really hate this "feature". You can
        > a) Open the file as codepage "UTF-8 (no conversion)" and possibly also
        > switch off document --> Read only.
        > or
        > b) Open an empty document and your page in another editor and copy and
        > paste all of it over.
        >
      • stitch.happy
        Message 3 of 9 , Oct 10, 2012
          Sweet. Worked the first time. Now a part of my clip library of handy stuff, with attribution to Axel. I used method (a). Thanks, Axel! And thanks for the hint to unwrap broken long lines. Hints like that to the newbies keep the frustration down. Keep up the good work, folks!

          Bev

          --- In ntb-html@yahoogroups.com, Axel Berger <Axel-Berger@...> wrote:
          >
          > Marcelo Bastos wrote:
          > > The problem: if there were Unicode characters there, you lost them.
          >
          > Which is why that's not the way to do it. Hope the following is correct
          > (i.e. works first time), I really hate this "feature". You can
          > a) Open the file as codepage "UTF-8 (no conversion)" and possibly also
          > switch off document --> Read only.
          > or
          > b) Open an empty document and your page in another editor and copy and
          > paste all of it over.
          >
          > To get rid of the UTF characters and convert them to HTML entities you
          > can run this clip:
          >
          > ---------------------------------------------------------------
          > :loop
          > ^!Find "[\xC0-\xF7][\x80-\xBF]*" RS
          > ^!IfError donelatin
          > ^!IfMatch "[\xC2-\xC3][\x80-\xBF]" "^$GetSelection$" latin1
          > ^!IfMatch "[\xC0-\xDF][\x80-\xBF]" "^$GetSelection$" zwei
          > ^!IfMatch "[\xE0-\xEF][\x80-\xBF]{2}" "^$GetSelection$" drei
          > ^!IfMatch "[\xF0-\xF7][\x80-\xBF]{3}" "^$GetSelection$" vier
          > ^!Continue Illegal sequence, can't be converted.
          > ^!Goto loop
          > :zwei
          > ^!Set %first%=^$Calc(^$CharToDec(^$StrIndex("^$GetSelection$";2)$)$ MOD 64)$
          > ^!Set %second%=^$Calc(^$CharToDec(^$StrIndex("^$GetSelection$";1)$)$ MOD 32)$
          > ^!Set %third%=0
          > ^!Set %fourth%=0
          > ^!Goto makeent
          > :drei
          > ^!Set %first%=^$Calc(^$CharToDec(^$StrIndex("^$GetSelection$";3)$)$ MOD 64)$
          > ^!Set %second%=^$Calc(^$CharToDec(^$StrIndex("^$GetSelection$";2)$)$ MOD 64)$
          > ^!Set %third%=^$Calc(^$CharToDec(^$StrIndex("^$GetSelection$";1)$)$ MOD 16)$
          > ^!Set %fourth%=0
          > ^!Goto makeent
          > :vier
          > ^!Set %first%=^$Calc(^$CharToDec(^$StrIndex("^$GetSelection$";4)$)$ MOD 64)$
          > ^!Set %second%=^$Calc(^$CharToDec(^$StrIndex("^$GetSelection$";3)$)$ MOD 64)$
          > ^!Set %third%=^$Calc(^$CharToDec(^$StrIndex("^$GetSelection$";2)$)$ MOD 64)$
          > ^!Set %fourth%=^$Calc(^$CharToDec(^$StrIndex("^$GetSelection$";1)$)$ MOD 8)$
          > :makeent
          > ^!Set %first%=^$Calc(262144*^%fourth%+4096*^%third%+64*^%second%+^%first%;0)$
          > ^!InsertText &#^%first%;
          > ^!Goto loop
          > :latin1
          > ^!Set %first%=^$StrCopyRight("^$GetSelection$";1)$
          > ^!Set %second%=^$StrCopyLeft("^$GetSelection$";1)$
          > ^!Set %first%=^$Calc(^$CharToDec(^%first%)$ MOD 64)$
          > ^!Set %second%=^$Calc(^$CharToDec(^%second%)$ MOD 4)$
          > ^!InsertText ^$DecToChar(^$Calc(64*^%second%+^%first%)$)$
          > ^!Goto loop
          > :donelatin
          > ^!Replace "&#8364;" >> "€" WASTI
          > ^!Replace "&#352;" >> "Š" WASTI
          > ^!Replace "&#353;" >> "š" WASTI
          > ^!Replace "&#381;" >> "Ž" WASTI
          > ^!Replace "&#382;" >> "ž" WASTI
          > ^!Replace "&#338;" >> "Œ" WASTI
          > ^!Replace "&#339;" >> "œ" WASTI
          > ^!Replace "&#376;" >> "Ÿ" WASTI
          > ---------------------------------------------------------------
          >
          > Beware of broken long lines. Each line begins with either "^" or ":".
          >
          > > You
          > > then have to figure out what they were and where they went originally.
          > > And then you have to find out the character entities for them and enter
          > > them manually.
          > >
          > > One way to do that, I found, is by using Microsoft Word. Open the
          > > original file in Word, save it as "Web page, filtered." Word is pretty
          > > useless as a HTML editor, but it does have good Unicode support, and it
          > > will usually convert Unicode to a Win-1252 file with all the
          > > 1252-incompatible characters to HTML numbered entities. Then you open
          > > this file in Notepad, search for "&#", and there you have it, the
          > > mystery characters.
          > >
          > > And that is the second reason I still keep Word in my computer, since I
          > > hardly ever use it for writing nowadays. (The first reason is that the
          > > file-compare feature in Word is pretty kickass, and I have to compare
          > > files now and then).
          > >
          > > --
          > > MCBastos
          > >
          > > This message has been protected with the 2ROT13 algorithm. Unauthorized use will be prosecuted under the DMCA.
          > > -=-=-
          > > ... Sent from my HAL 9000.
          > > * Added by TagZilla 0.7a1 running on Seamonkey 2.12.1 *
          > > Get it at http://xsidebar.mozdev.org/modifiedmailnews.html#tagzilla
          > >
          > --
          > Dipl.-Ing. F. Axel Berger Tel: +49/ 2174/ 7439 07
          > Johann-Häck-Str. 14 Fax: +49/ 2174/ 7439 68
          > D-51519 Odenthal-Heide eMail: Axel-Berger@...
          > Deutschland (Germany) http://berger-odenthal.de
          >
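
For readers who want to follow the arithmetic in the quoted clip outside NoteTab, here is a rough Python re-expression of the same idea. It is a sketch, not a replacement for the clip: the function names are hypothetical, it works on raw bytes rather than on a NoteTab document, and illegal sequences are simply left untouched instead of prompting the user. The MOD operations and the multipliers (64, 4096, 262144) are taken from the clip.

```python
import re

# A lead byte 0xC0-0xF7 plus any continuation bytes: the pattern the clip searches for.
SEQUENCE = re.compile(rb"[\xC0-\xF7][\x80-\xBF]*")


def convert_sequence(seq: bytes) -> bytes:
    """Turn one UTF-8 byte sequence into a Latin-1 byte or a numeric entity."""
    lead, n = seq[0], len(seq)
    if lead in (0xC2, 0xC3) and n == 2:
        # "latin1" branch: U+0080..U+00FF fits into a single byte.
        return bytes([64 * (lead % 4) + seq[1] % 64])
    if 0xC0 <= lead <= 0xDF and n == 2:
        # "zwei": 5 payload bits in the lead byte, 6 in the continuation byte.
        code = 64 * (lead % 32) + seq[1] % 64
    elif 0xE0 <= lead <= 0xEF and n == 3:
        # "drei": 4 + 6 + 6 payload bits.
        code = 4096 * (lead % 16) + 64 * (seq[1] % 64) + seq[2] % 64
    elif 0xF0 <= lead <= 0xF7 and n == 4:
        # "vier": 3 + 6 + 6 + 6 payload bits.
        code = (262144 * (lead % 8) + 4096 * (seq[1] % 64)
                + 64 * (seq[2] % 64) + seq[3] % 64)
    else:
        return seq  # illegal sequence: left as-is in this sketch
    return b"&#%d;" % code


def utf8_to_entities(data: bytes) -> bytes:
    """Replace every multi-byte UTF-8 sequence found in data."""
    return SEQUENCE.sub(lambda m: convert_sequence(m.group()), data)


print(utf8_to_entities("\u2713 \u00e9".encode("utf-8")))
```

The check mark (bytes E2 9C 93) works out to 4096*2 + 64*28 + 19 = 10003, so it becomes `&#10003;`, while "é" (C3 A9) collapses to the single byte E9.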
        • Marcelo Bastos
          Message 4 of 9 , Oct 10, 2012
            Interviewed by CNN on 10/10/2012 07:01, Axel Berger told the world:
            > Marcelo Bastos wrote:
            >> The problem: if there were Unicode characters there, you lost them.
            > Which is why that's not the way to do it. Hope the following is correct
            > (i.e. works first time), I really hate this "feature". You can
            > a) Open the file as codepage "UTF-8 (no conversion)" and possibly also
            > switch off document --> Read only.
            That's a very nice piece of clip programming, and yes, it DID work first
            time. (Well, after I fixed a couple statements that had been
            line-wrapped by the mail systems, that is.) Thank you, it will prove
            most useful in the coming weeks.
            I had a quick look at the logic, and it seems to be generic enough to
            tackle the entire Basic Multilingual Plane. Which is good, since I have to
            deal with a couple text sources who just *love* to use obscure
            characters from languages you never heard about for aesthetic effect.

            I'm already thinking about four or five ways I can integrate it into my
            workflow. It will probably end up as the main subroutine of a larger
            clip. I'm thinking of starting with an auto-reload of the file as "UTF-8
            (no conversion)," then a preprocessing search-and-replace to get rid of
            the most common cases, like "smart quotes" (not strictly needed, but it
            should speed up the process quite a bit), and a post-processing
            "cleanup" phase using a couple clips I already have in hand.

            --

            MCBastos This message has been protected with the 2ROT13 algorithm.
            Unauthorized use will be prosecuted under the DMCA.

            -=-=-
            ... Sent from my Total Lack of Social Skills.
            * Added by TagZilla 0.7a1 running on Seamonkey 2.13 *
            Get it at http://xsidebar.mozdev.org/modifiedmailnews.html#tagzilla
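
The workflow Marcelo outlines (a pre-pass for smart quotes, then turning whatever Windows-1252 cannot hold into numbered entities) could be sketched in Python roughly as follows. This is not a NoteTab clip and not Marcelo's actual code: the function name and the quote table are made up for illustration. The only library feature relied on is Python's built-in `xmlcharrefreplace` error handler, which emits `&#N;` references for characters the target encoding lacks.

```python
# Hypothetical pre-pass table: the most common "smart quotes" mapped to named entities.
SMART_QUOTES = str.maketrans({
    "\u2018": "&lsquo;",
    "\u2019": "&rsquo;",
    "\u201c": "&ldquo;",
    "\u201d": "&rdquo;",
})


def to_win1252_html(text: str) -> bytes:
    """Replace smart quotes, then encode to Windows-1252 with numeric entities as fallback."""
    text = text.translate(SMART_QUOTES)
    return text.encode("cp1252", errors="xmlcharrefreplace")


print(to_win1252_html("\u201cok\u201d \u2713 caf\u00e9"))
```

Searching the result for "&#" then shows the characters that had no Windows-1252 equivalent, much like the Word-based approach described earlier in the thread.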
          • Axel Berger
            Message 5 of 9 , Oct 10, 2012
              Marcelo Bastos wrote:
              > I had a quick look at the logic, and it seems to be generic enough to
              > tackle the entire Basic Multilingual Plane.

              Even more than that, it will also translate illegal UTF into equally
              illegal entities. I have another clip that checks a document for legal
              UTF and flags errors such as ANSI characters.

              ---------------------------------------------------------------
              :loop
              ^!Find "([\x80-\xBF]|[\xC0-\xFF][\x80-\xBF]*)" RS
              ^!IfError usasc
              ^!IfMatch "[\xC2-\xDF][\x80-\xBF]" "^$GetSelection$" loop
              ^!IfMatch "\xE0[\xA0-\xBF][\x80-\xBF]" "^$GetSelection$" loop
              ^!IfMatch "[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}" "^$GetSelection$" loop
              ^!IfMatch "\xED[\x80-\x9F][\x80-\xBF]" "^$GetSelection$" loop
              ^!IfMatch "\xF0[\x90-\xBF][\x80-\xBF]{2}" "^$GetSelection$" loop
              ^!IfMatch "[\xF1-\xF3][\x80-\xBF]{3}" "^$GetSelection$" loop
              ^!IfMatch "\xF4[\x80-\x8F][\x80-\xBF]{2}" "^$GetSelection$" loop
              ^!Continue Illegal sequence, no UTF-8
              ^!Goto loop
              :usasc
              ^!Continue No errors found
              ---------------------------------------------------------------

              Axel
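
The byte ranges in the checking clip above can be tried outside NoteTab as well. The following Python sketch reuses the same patterns on a bytes object and returns the offsets of sequences that are not well-formed UTF-8. The names `VALID`, `CANDIDATE` and `find_illegal` are hypothetical, and unlike the clip it collects all problems in a list instead of stopping at each one.

```python
import re

# Well-formed multi-byte sequences, using the ranges from the clip's ^!IfMatch lines.
VALID = re.compile(
    rb"[\xC2-\xDF][\x80-\xBF]"
    rb"|\xE0[\xA0-\xBF][\x80-\xBF]"
    rb"|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}"
    rb"|\xED[\x80-\x9F][\x80-\xBF]"
    rb"|\xF0[\x90-\xBF][\x80-\xBF]{2}"
    rb"|[\xF1-\xF3][\x80-\xBF]{3}"
    rb"|\xF4[\x80-\x8F][\x80-\xBF]{2}"
)

# Anything that is not plain ASCII: a stray continuation byte, or a lead byte
# followed by however many continuation bytes happen to come after it.
CANDIDATE = re.compile(rb"[\x80-\xBF]|[\xC0-\xFF][\x80-\xBF]*")


def find_illegal(data: bytes):
    """Return (offset, bytes) for every non-ASCII run that is not well-formed UTF-8."""
    return [
        (m.start(), m.group())
        for m in CANDIDATE.finditer(data)
        if not VALID.fullmatch(m.group())
    ]


print(find_illegal("caf\u00e9 \u2713".encode("utf-8")))  # well-formed input
print(find_illegal(b"caf\xe9"))                           # an ANSI "é" in a UTF-8 file
```

An empty list corresponds to the clip's "No errors found"; an ANSI character such as a lone E9 byte shows up as an illegal one-byte sequence.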