
2255 Re: Bug in Vim's locales when calling Python script (?)

  • Marijn
    Jan 3, 2007
      A.J.Mechelynck wrote:
      > Marijn wrote:
      >> Hi,
      >>
      >> I think I found a bug in Vim's UTF-8 handling. I've spent two days debugging, testing and pulling my hair out, but I couldn't find a solution to the problem, and now I think it's a bug. I use Gentoo, Vim 7.0 and UTF-8 in the kernel, etc. To debug and check the following message I compiled a fresh Vim from source, but unfortunately the results are the same for both versions (built from source and the Gentoo build).
      >>
      >> I have a Python Vim script that fetches (WordPress) content via XML-RPC (the XML-RPC source is UTF-8 encoded). After the content is fetched it is written to the current buffer. This works fine, but when the content contains any non-ASCII characters the script fails (error below). I've reduced the script to the following:
      >>
      >> =================================================
      >>
      >> if has('python')
      >> python << EOF
      >> # -*- coding: utf-8 -*-
      >>
      >> import vim
      >>
      >> def foo():
      >>     u = 'a' # This works fine
      >>     vim.current.buffer.append(u)
      >>
      >> def bar():
      >>     u = unichr(40960) # But this doesn't
      >>     vim.current.buffer.append(u)
      >>
      >> EOF
      >> endif
      >>
      >> =================================================
      >>
      >>
      >> When I call this in Vim, foo() works fine, "a" is appended at the end of the buffer, but calling bar() results in the following error:
      >>
      >> Traceback (most recent call last):
      >>   File "<string>", line 1, in ?
      >>   File "<string>", line 11, in bar
      >> TypeError: bad argument type for built-in operation
      >>
      >>
      >> I've started Vim in utf8 mode:
      >>
      >> marijn@srv ~ $ export LC_ALL=en_US.utf8
      >> marijn@srv ~ $ export LANG=en_US.utf8
      >> marijn@srv ~ $ vim
      >>
      >>
      >> So that can't be the problem.
      >>
      >> I've also created a separate file:
      >>
      >> =================================================
      >>
      >> #!/usr/bin/python
      >> # -*- coding: utf8 -*-
      >> # test.py
      >> print unichr(40960)
      >>
      >> =================================================
      >>
      >> Running this in a shell with LC_ALL=en_EN.utf8 works fine; with LC_ALL=C it fails, which is expected.
      >>
      >> When running it in Vim ":!python test.py" works fine, the character is printed.
      >> But when I try to insert it into the current buffer with ":r !python test.py" it fails: "UnicodeEncodeError: 'ascii' codec can't encode character u'\ua000' in position 0: ordinal not in range(128)"
      >>
      >> Maybe this is because the ":r" command does not handle UTF-8 well (http://vimdoc.sourceforge.net/htmldoc/mbyte.html#UTF-8), but I'm not sure of that. I wanted to include it for completeness.
      >>
      >> I think this (especially the first part) is a bug in Vim. I hope you can confirm this and/or help find a solution.
      >>
      >>
      >> Tia and best wishes,
      >>
      >> Marijn Koesen
      >>
      >
      > 1. Is your Vim executable built with +multi_byte?
      > :echo has("multi_byte")
      > should answer 1
      > If the answer is zero, you should install a Vim executable with +multi_byte compiled-in.
      >
      > 2. Do you have 'encoding' set to UTF-8?
      > :set enc?
      > should answer
      > encoding=utf-8
      > If the answer is something else (but it passes test 1 above), tell me what it is and I'll tell you what to add to your vimrc.
      >
      > If the answer to either question is "no", Vim cannot handle UTF-8 codepoints above U+007F (or maybe U+00FF, depending).
      >
      >
      > Best regards,
      > Tony.
      >


      1) Yes, it's compiled with multi_byte:

      Some more details about my vim version:

      :version
      VIM - Vi IMproved 7.0 (2006 May 7, compiled Jan 2 2007 00:32:34)
      Included patches: 1-17
      Modified by Gentoo-7.0.17
      Compiled by root@henk
      Huge version without GUI. Features included (+) or not (-):
      +arabic +autocmd -balloon_eval -browse ++builtin_terms +byte_offset +cindent -clientserver -clipboard +cmdline_compl
      +cmdline_hist +cmdline_info +comments +cryptv -cscope +cursorshape +dialog_con +diff +digraphs -dnd -ebcdic +emacs_tags +eval
      +ex_extra +extra_search +farsi +file_in_path +find_in_path +folding -footer +fork() +gettext -hangul_input +iconv
      +insert_expand +jumplist +keymap +langmap +libcall +linebreak +lispindent +listcmds +localmap +menu +mksession +modify_fname
      +mouse -mouseshape +mouse_dec +mouse_gpm -mouse_jsbterm +mouse_netterm +mouse_xterm +multi_byte +multi_lang -mzscheme
      -netbeans_intg -osfiletype +path_extra +perl +postscript +printer +profile +python +quickfix +reltime +rightleft +ruby
      +scrollbind +signs +smartindent -sniff +statusline -sun_workshop +syntax +tag_binary +tag_old_static -tag_any_white -tcl
      +terminfo +termresponse +textobjects +title -toolbar +user_commands +vertsplit +virtualedit +visual +visualextra +viminfo
      +vreplace +wildignore +wildmenu +windows +writebackup -X11 -xfontset -xim -xsmp -xterm_clipboard -xterm_save
      system vimrc file: "/etc/vim/vimrc"
      user vimrc file: "$HOME/.vimrc"
      user exrc file: "$HOME/.exrc"
      fall-back for $VIM: "/usr/share/vim"
      Compilation: i686-pc-linux-gnu-gcc -c -I. -Iproto -DHAVE_CONFIG_H -march=pentium3 -O2 -pipe -fomit-frame-pointer -pipe -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/lib/perl5/5.8.8/i686-linux/CORE -I/usr/include/python2.4 -pthread -I/usr/lib/ruby/1.8/i686-linux
      Linking: i686-pc-linux-gnu-gcc -rdynamic -Wl,-export-dynamic -rdynamic -L/usr/local/lib -o vim -lncurses -lgpm -rdynamic -L/usr/local/lib /usr/lib/perl5/5.8.8/i686-linux/auto/DynaLoader/DynaLoader.a -L/usr/lib/perl5/5.8.8/i686-linux/CORE -lperl -lutil -lc -L/usr/lib/python2.4/config -lpython2.4 -lpthread -lutil -Xlinker -export-dynamic -Wl,-R -Wl,/usr/lib -L/usr/lib -L/usr/lib -lruby18 -lm


      2) Yes, all the files that I have used and created are (and were) UTF-8 encoded. I checked each of them with ":set enc?".
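
One workaround that may be worth trying, on the assumption that the Python interface in this build accepts only byte strings and raises the TypeError above for unicode objects: encode the text to Vim's 'encoding' before appending it. A minimal sketch follows; encode_for_vim is a hypothetical helper name, not part of the vim module, and the Vim-side call is left as a comment because it only runs inside Vim.

```python
# -*- coding: utf-8 -*-
# Sketch only: encode_for_vim is a hypothetical helper, not a vim module function.

def encode_for_vim(text, encoding='utf-8'):
    """Return text as a byte string in the given encoding.

    Unicode objects are encoded; anything that is already a byte
    string is returned unchanged.
    """
    if isinstance(text, type(u'')):
        return text.encode(encoding)
    return text

# U+A000 is the character produced by unichr(40960) in the script above.
line = encode_for_vim(u'\ua000')

# Inside Vim (assumption: buffer.append wants a byte string in this build):
#   import vim
#   vim.current.buffer.append(encode_for_vim(u, vim.eval('&encoding')))
```

The same explicit encoding step may also be relevant to ":r !python test.py": when its output is piped rather than sent to a terminal, Python 2 does not take an output encoding from the locale and falls back to ASCII, so "print unichr(40960).encode('utf-8')" in test.py is one thing to try.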


      Best regards,

      Marijn