
RE: [Clip] How to fix non-Ascii characters using NoteTab

  • John Shotsky
    Message 1 of 33 , Aug 31, 2013
      It isn't perfect as it is, but I'm working through the issues. For one thing, a right smart quote is three bytes in UTF-8 (hex E2 80 9D), which you
      can't copy/paste into a replace. But, this works:
      ^!Replace "\xE2\x80\x9d" >> """ ARSW
      ^!IfError Next Else Skip_-1
      EditPad Pro is great for showing the hex.
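For readers who want to check the byte sequence outside NoteTab, here is a minimal Python sketch; it is not part of the original clip. It confirms that the right double quotation mark U+201D encodes in UTF-8 to the three bytes E2 80 9D used in the ^!Replace line, and replaces that sequence with a plain ASCII quote. The function name `straighten_right_quotes` and the sample text are made up for illustration.

```python
RIGHT_DOUBLE_QUOTE = "\u201d"  # the "right smart quote"

# UTF-8 encodes U+201D as the three bytes E2 80 9D.
utf8_bytes = RIGHT_DOUBLE_QUOTE.encode("utf-8")


def straighten_right_quotes(raw: bytes) -> bytes:
    """Replace every UTF-8 right smart quote with an ASCII double quote."""
    return raw.replace(b"\xe2\x80\x9d", b'"')


if __name__ == "__main__":
    # Hypothetical sample line containing a left and a right smart quote.
    sample = "1 cup \u201cself-rising\u201d flour".encode("utf-8")
    print(utf8_bytes.hex())
    print(straighten_right_quotes(sample))
```

Working on bytes rather than decoded text mirrors what the clip does: the file is treated as raw bytes and only the known three-byte sequence is touched.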
      Regards,
      John
      RecipeTools Web Site: http://recipetools.gotdns.com/
      John's Mags Yahoo Group: http://groups.yahoo.com/group/johnsmags/

      From: ntb-clips@yahoogroups.com [mailto:ntb-clips@yahoogroups.com] On Behalf Of Axel Berger
      Sent: Saturday, August 31, 2013 17:23
      To: ntb-clips@yahoogroups.com
      Subject: Re: [Clip] How to fix non-Ascii characters using NoteTab


      John Shotsky wrote:
      > It amounts to opening the file in NoteTab in UTF-8 (no conversion)
      > mode. That way, it doesn't throw away any characters.

      If that works, and I've just confirmed that it does, it certainly is a
      better method than all that copying. For some reason, possibly my own
      mistakes, I could not make it work consistently in the past.

      > I think the gibberish will always be the same,

      Absolutely. The encoding of UTF-8 for all characters greater than 127 is
      precisely defined. Whatever it may look like, the new feature of showing
      you the byte value of single selected characters in the status line will
      verify it. This is what makes my conversion from UTF byte sequences to
      entities possible. As that works for ALL UTF sequences, not just those
      known to you, it may be the best solution anyway.
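The idea can be sketched outside NoteTab as well. The following Python snippet is an illustration only, not Axel's actual script (which is not shown in this thread): it decodes raw UTF-8 bytes and writes every character above 127 as a decimal numeric HTML entity, so it covers all sequences rather than a hand-maintained list. The function name `utf8_to_entities` is hypothetical.

```python
def utf8_to_entities(raw: bytes) -> str:
    """Decode UTF-8 bytes and replace each non-ASCII character
    with a decimal numeric character reference such as &#8531;."""
    text = raw.decode("utf-8")
    # The "xmlcharrefreplace" handler turns anything ASCII cannot hold
    # into &#NNNN; and leaves plain ASCII characters untouched.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # E2 85 93 is the UTF-8 byte sequence for the one-third fraction U+2153.
    print(utf8_to_entities(b"\xe2\x85\x93 cup flour"))
```

Because the mapping from bytes to code points is fixed by UTF-8 itself, no table of known gibberish strings is needed.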

      I shall try to use the load as raw UTF more in the future. Shame one
      can't make that setting the default.

      Axel



    • Roopakshi Pathania
      Message 33 of 33 , Sep 7, 2013
        Hi Axel,

        I've not been following this thread, but will throw out a couple of suggestions based on what I've read.
        If you wish to use those fraction characters both for entering and back-converting in NTP and for converting them into HTML, why not try MathML or LaTeX?
        MathML may be a bit tedious, but it is appropriate for HTML, and it is readable as well as replaceable in any text editor.
        LaTeX can be entered and converted into HTML using TeX4ht. It is also replaceable.

        Again, since I didn't read most of the mails, I'm not sure whether my suggestions will help.

        Sent from my Lenovo ThinkPad

        --------------------------------------------
        On Sun, 9/1/13, Axel Berger <Axel-Berger@...> wrote:

        Subject: Re: [Clip] How to fix non-Ascii characters using NoteTab
        To: ntb-clips@yahoogroups.com
        Date: Sunday, September 1, 2013, 12:31 AM

        John Shotsky wrote:
        > I use EditPad Pro on an expired trial for working with Unicode files.
        > When I open the html file with EditPad I can see these characters
        > just fine.

        That may well be the problem. That and some shenanigans Windows itself
        engages in with copying and pasting.

        > I have taken the liberty of cc'ing your personal email address,
        > and have attached the html.

        I have opened the html in Firefox and a UTF UNaware simple editor. In
        the first I see all characters and copying and pasting translates them
        from UTF to ANSI or an ASCII equivalent thus:

        ¼ cup flour
        ¾ cup milk
        1/3 cup flour

        The editor shows me the individual bytes the characters are made of and
        I can copy them to NT unchanged:

        >¾</strong> cup milk</div>
        >¼ cup flour</strong></div>
        >â…“ cup flour</strong></div>

        Running my own UTF script over them yields:

        >¾</strong> cup milk</div>
        >¼ cup flour</strong></div>
        >⅓ cup flour</strong></div>

        (Converting everything possible to cp-1252 = ANSI is on purpose.
        Omitting those parts it would be even easier to make everything an
        entity.)
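
The behaviour described above can be imitated in a few lines of Python. This is a sketch under assumptions, not the script used in the message: it assumes the input is valid UTF-8, keeps every character that cp-1252 can represent, and turns the rest into numeric entities. It also reproduces the three-character gibberish a UTF-unaware editor shows for the one-third fraction. The names `as_seen_in_ansi_editor` and `to_cp1252_with_entities` are invented for this example.

```python
def as_seen_in_ansi_editor(text: str) -> str:
    """Show how UTF-8 bytes look when each byte is read as cp-1252."""
    return text.encode("utf-8").decode("cp1252")


def to_cp1252_with_entities(raw: bytes) -> bytes:
    """Decode UTF-8, keep what cp-1252 can hold, entity-encode the rest."""
    text = raw.decode("utf-8")
    # Characters outside cp-1252 become decimal references like &#8531;.
    return text.encode("cp1252", errors="xmlcharrefreplace")


if __name__ == "__main__":
    # U+2153 (one third) is not in cp-1252; U+00BE (three quarters) is.
    print(as_seen_in_ansi_editor("\u2153"))
    print(to_cp1252_with_entities("\u00be cup milk".encode("utf-8")))
    print(to_cp1252_with_entities("\u2153 cup flour".encode("utf-8")))
```

One caveat: Python's cp-1252 codec leaves a few byte values undefined (0x9D among them), so `as_seen_in_ansi_editor` raises an error for characters such as the right smart quote, whose UTF-8 form ends in 9D. Dropping the cp-1252 step and encoding to ASCII instead would make every non-ASCII character an entity.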



        There may be OS issues here too. Parts of eXPerimental are UTF-aware and
        might interfere. I'm using Win98SE, but I doubt that's the difference.
        (To try I'd need to install stuff first.)

        Axel