line2byte() returns wrong result at multi-byte characters

  • Дмитрий Франк
    Message 1 of 33, Dec 19, 2011
      line2byte() does not account for multi-byte characters.

      For example, if my buffer has 'fileencoding' set to utf-8 and contains some Cyrillic characters (each Cyrillic character takes 2 bytes), then line2byte('.') returns the wrong result: it does not take the multi-byte characters into account.
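
A small illustration of the mismatch described above, written in Python rather than Vimscript so it can be run outside Vim. It is only a sketch: the sample text is made up, and it assumes the internal 'encoding' is a single-byte Cyrillic encoding such as cp1251, which the message does not state.

```python
# Hypothetical sample line; any Cyrillic text behaves the same way.
first_line = "Привет"

# Bytes the line occupies on disk when the file is stored as utf-8.
utf8_len = len(first_line.encode("utf-8"))
# Bytes the same line occupies in a single-byte Cyrillic encoding
# (assumed here to stand in for the internal 'encoding').
cp1251_len = len(first_line.encode("cp1251"))

# 1-based offset of the start of line 2:
# bytes of line 1, plus one LF byte, plus one for 1-based counting.
line2_start_utf8 = utf8_len + 1 + 1
line2_start_cp1251 = cp1251_len + 1 + 1

print(utf8_len, cp1251_len)
print(line2_start_utf8, line2_start_cp1251)
```

The two offsets for the same line differ, which is the kind of discrepancy an external tool working on the utf-8 file would run into.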

      Regards,
      Dmitry.

      --
      You received this message from the "vim_dev" maillist.
      Do not top-post! Type your reply below the text you are replying to.
      For more information, visit http://www.vim.org/maillist.php
    • Ingo Karkat
      Message 33 of 33, Jan 4, 2012
        On 04-Jan-2012 17:22, Bram Moolenaar wrote:

        > Ingo Karkat wrote:
        >
        >> On 04-Jan-2012 00:39, Ben Fritz wrote:
        >>
        >>> [...]
        >>> I don't notice anything about
        >>> line2byte() in the todo list, so I'm expressing my support again
        >>> either for a new function, or an optional argument to line2byte();
        >>> either telling it to use 'fileencoding' or giving it an encoding to
        >>> use.
        >>
        >> +1; the current Vimscript workaround is ugly and inefficient.
        >>
        >> I also still would like to get a clarification how "go" / :goto should
        >> behave in this particular case. I would prefer an overloaded :goto!
        >> command that uses 'fileencoding' instead of 'encoding' as the base for
        >> counting the bytes. This should be trivial to add after the
        >> line2byte() enhancement is done.
        >>
        >> On 20-Dec-2011 08:45, Ingo Karkat wrote:
        >>
        >>> I'd like to direct the attention also to the related "go" / :goto
        >>> commands. I have rarely used them, but I imagine that the main use
        >>> case is that some external tool is pointing me to a byte offset N in a
        >>> file, and I want to use Vim to quickly check that position. For that
        >>> to work (when &fileencoding differs from &encoding, i.e. what this
        >>> entire discussion is about), Vim would have to use fileencoding-based
        >>> counts. Currently, I would need to use a tool like xxd to achieve that
        >>> goal (assuming I cannot just temporarily switch &encoding, like when
        >>> &fileencoding is ucs-2 or ucs-4).
        >>
        >> Bram, shouldn't "go" be based on 'fileencoding' rather than the internal
        >> representation in Vim (in the few cases where there's actually a distinction)?
        >> Can you please give your blessing to a line2byte() / :goto enhancement and put
        >> it into the todo list, so that this eventually gets implemented?
        >
        > Supporting the byte count for anything but 'encoding' is going to be
        > terribly inefficient. It would require conversion between 'encoding'
        > and 'fileencoding' on every command and/or storing the cached byte
        > offset twice (and recomputing it when changing 'fileencoding').

        Yeah, but as long as nobody puts this into 'statusline', it will only be invoked
        occasionally, both for "go" / :goto and the OP's use case of Eclim completion.
        For me, the real question is how much effort is required to implement this;
        though I suppose a naive implementation can be based on the proposed Vimscript
        workaround.
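
A rough sketch of what such a naive approach could look like, in Python rather than Vimscript and not taken from the workaround proposed earlier in the thread: every line before the target is re-encoded into the file's encoding and the byte lengths are summed. The function names, the `eol` parameter, and the -1 return for out-of-range input are assumptions made for illustration; BOMs and 'fileformat' variations are ignored.

```python
def line2byte_fenc(lines, lnum, fileencoding, eol="\n"):
    """1-based byte offset of the start of line `lnum` (1-based) when the
    buffer is written in `fileencoding`. Returns -1 if `lnum` is out of range.
    `lnum == len(lines) + 1` yields the total size plus one."""
    if not 1 <= lnum <= len(lines) + 1:
        return -1
    offset = 1
    for line in lines[:lnum - 1]:
        # Encode the line terminator too: it is wider than one byte in
        # encodings such as utf-16.
        offset += len((line + eol).encode(fileencoding))
    return offset


def byte2line_fenc(lines, byte_offset, fileencoding, eol="\n"):
    """Inverse lookup for a goto-style jump: the 1-based line number that
    contains the 1-based `byte_offset`, or -1 if it lies outside the buffer."""
    start = 1
    for lnum, line in enumerate(lines, start=1):
        end = start + len((line + eol).encode(fileencoding))
        if start <= byte_offset < end:
            return lnum
        start = end
    return -1


# Hypothetical two-line buffer.
buffer_lines = ["Привет", "world"]
print(line2byte_fenc(buffer_lines, 2, "utf-8"))
print(line2byte_fenc(buffer_lines, 2, "utf-16-le"))
print(byte2line_fenc(buffer_lines, 14, "utf-8"))
```

Each call walks the buffer from the top, so the cost grows with the line number. That is tolerable for an occasional jump, but it is the inefficiency that makes this unsuitable for something evaluated on every redraw.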

        > Since utf-8 is more and more becoming the standard file encoding
        > everywhere it's better to standardize on that.

        As long as files with other encodings exist, people may want to jump to a
        particular byte offset. Why not just put it on the todo list, and see what
        happens first: the demise of file encodings, or someone contributing a patch :-)

        -- regards, ingo
