Change in script behavior with encoding

  • Srinath Avadhanula
    Message 1 of 3 , Dec 1 5:33 PM
      Hello,

      Recently, I noticed that the behavior of a vim script file changes based
      on what encoding the user is currently using. I tried reading the
      encoding documentation briefly, but got lost quite quickly.

      Please try out the following steps:

      1. open a new vim file with encoding set to latin1.

      2. type in the following stuff: (the « character can be typed as
      CTRL-Q + < + < or CTRL-V + < + <)

      let MyVar = '«'

      if MyVar =~ '«'
        echo 'does contain quotes'
      else
        echo 'no quotes here'
      endif

      3. try sourcing it (:so %)
      you will see 'does contain quotes'

      4. now set encoding to utf8. (:set encoding=utf8)

      5. source it again.
      this time you will see 'no quotes here'

      Could someone explain? How can I get the same behavior from the script
      file irrespective of what encoding the user is currently using?

      Thanks,

      Srinath
    • Steve Hall
      Message 2 of 3 , Dec 1 6:37 PM
        On Sun, 2002-12-01 at 20:33, Srinath Avadhanula wrote:
        >
        > Recently, I noticed that the behavior of a vim script file changes
        > based on what encoding the user is currently using. I tried reading
        > the encoding documentation briefly, but got lost quite quickly.
        >
        > [snip]
        >
        > Could someone explain? How can I get the same behavior from the script
        > file irrespective of what encoding the user is currently using?

        I just went through this with our scripts. I found that any character
        with a decimal value above 127 is asking for trouble outside of
        &encoding=latin1. So I started conditioning based on encoding whenever
        I want something "excessive":

        if &encoding == "latin1"
          let mybookmarksymbol = "»»"
        else
          let mybookmarksymbol = "=>"
        endif

        A user on Chinese Windows said it cleared up all his multi-byte issues
        with our scripts (720K total).

        HTH.

        Steve Hall [ digitect(at)mindspring.com ]
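
        An alternative to branching on &encoding, not proposed by anyone in
        the thread, is to build the character from its code point with
        nr2char() instead of typing it literally, so the script file itself
        stays pure ASCII. A minimal sketch, assuming 'encoding' is either
        latin1 or utf-8 (171 is « and 187 is » in both); the s:laquo and
        s:raquo names are made up for illustration:

        ```vim
        " Assumption: 'encoding' is latin1 or utf-8, where 171 is the
        " left guillemet and 187 is the right guillemet.
        " nr2char() builds the character according to the current 'encoding'.
        let s:laquo = nr2char(171)
        let s:raquo = nr2char(187)

        let MyVar = s:laquo

        " The pattern is built the same way, so both sides always agree.
        if MyVar =~ s:laquo
          echo 'does contain quotes'
        else
          echo 'no quotes here'
        endif

        " Same idea applied to the bookmark symbol.
        let mybookmarksymbol = s:raquo . s:raquo
        ```

        Under other single-byte encodings byte 171 is a different character,
        so an ASCII fallback like the one above may still be wanted there.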
      • Antoine J. Mechelynck
        Message 3 of 3 , Dec 1 8:10 PM
          Steve Hall <digitect@...> wrote:
          > On Sun, 2002-12-01 at 20:33, Srinath Avadhanula wrote:
          > >
          > > Recently, I noticed that the behavior of a vim script file changes
          > > based on what encoding the user is currently using. I tried reading
          > > the encoding documentation briefly, but got lost quite quickly.
          > >
          > > [snip]
          > >
          > > Could someone explain? How can I get the same behavior from the script
          > > file irrespective of what encoding the user is currently using?
          >
          > I just went through this with our scripts. I found that any character
          > with a decimal value above 127 is asking for trouble outside of
          > &encoding=latin1. So I started conditioning based on encoding whenever
          > I want something "excessive":
          >
          > if &encoding == "latin1"
          > let mybookmarksymbol = "»»"
          > else
          > let mybookmarksymbol = "=>"
          > endif
          >
          > A user on Chinese Windows said it cleared up all his multi-byte issues
          > with our scripts (720K total).
          >
          > HTH.
          >
          > Steve Hall [ digitect(at)mindspring.com ]

          Maybe it works for you, Steve, because you never use any single-byte
          encoding other than Latin1. But if you write scripts for general
          consumption, beware that the people using them might use any kind of
          encodings, some of them single-byte (Latin1, Eastern Europe, Greek,
          Cyrillic, ...) others multibyte (utf8, East-Asian).

          The reason the script acts differently is that it is read according to the
          current encoding settings (see section 34 of the Vim FAQ and my tip
          http://vim.sourceforge.net/tip_view.php?tip_id=246 ); not only 'encoding'
          but 'fileencoding' and 'fileencodings' are relevant here.

          Maybe a modeline (:help modeline) in a Vim comment near the start of the
          script, setting the 'fileencoding' locally for the script, would help; it's
          not perfect though, since it has to be read before it is interpreted.

          Tony.
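
          Another option not mentioned in the thread is the :scriptencoding
          command (:help :scriptencoding), which declares what encoding the
          script file itself is in, so that the lines after it are converted
          to the current 'encoding' while the script is sourced. A sketch
          applied to the test case from the first message, assuming the file
          is saved on disk as latin1 and Vim was built with multi-byte
          support:

          ```vim
          " Assumption: this file is stored on disk in latin1.
          " Lines after this command are converted from latin1 to the
          " user's 'encoding' as they are read.
          scriptencoding latin1

          let MyVar = '«'

          if MyVar =~ '«'
            echo 'does contain quotes'
          else
            echo 'no quotes here'
          endif
          ```

          The declaration only affects lines that follow it, so it belongs
          near the top of the script, before the first non-ASCII character.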