107013 RE: improving the :join command

  • Gene Kwiecinski
    Sep 1, 2009
      >>No disrespect intended, but *why* in B'Harni's Dark Name would you
      >>want to join >10000 lines into 1?!?

      >There might be usecases. Data is growing rapidly today, and I myself
      >had to manage automatically generated text-files of several hundred MB
      >of size. Plus there have occasionally been questions on this list
      >regarding joining lines.

      Even so, something which I can understand, eg, a logfile, should be
      delineated by linebreaks. Raw xml/sgml/etc., should be edited as a
      sequence of modestly-sized lines, then if necessary, joined to a single
      line after saving (or before saving, if you have the time :D ).


      >Well just one simple test:

      >#v+
      >~$ for i in 1 2 4 8 16 32 64 128; do
      > seq 1 $(($i*1000)) >tempfile
      > echo "joining $i kilo lines"
      > time vim -u NONE -N -c ':%join|:q!' tempfile;
      >done
      >#v-

      >and compare the timings yourself. Doesn't this look like a bug to you?

      I have no idea, as I didn't run it yet. Offhand, an exponential
      increase wouldn't be out of the question, ie, e^n.
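
Since the quoted loop doubles the input size at every step, one rough way to tell polynomial growth from exponential growth is to look at the ratio between consecutive timings. The sketch below assumes you have already collected the wall-clock seconds by hand; `growth_ratios` is a hypothetical helper name, and the numbers passed to it are made up for illustration, not measurements. Ratios that settle near 2 suggest linear behavior, ratios near 4 suggest quadratic, and ratios that keep climbing suggest something worse than polynomial.

```shell
# growth_ratios: print the ratio of each timing to the one before it.
# Arguments: wall-clock seconds for inputs that double in size each step.
# Assumes every timing is greater than zero.
growth_ratios() {
    prev=""
    for t in "$@"; do
        if [ -n "$prev" ]; then
            # awk does the floating-point division that plain sh lacks
            awk -v a="$prev" -v b="$t" 'BEGIN { printf "%.1f\n", b / a }'
        fi
        prev=$t
    done
}

# Hypothetical timings in seconds, NOT measured results:
growth_ratios 1 4 16 64
```

This only classifies the trend; it says nothing about whether swapping or the join algorithm itself is the cause.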

      Don't forget *physical* limits such as available memory. Once you bang
      your head on that memory-ceiling and start having to swap to disk, all
      bets are off, and processing time can increase by order*s* of magnitude,
      depending how bad it is. Hell, I run into that in *perl*, let alone
      'gvim', when intentionally joining huge files to a single line to c&p
      whole sections of the file! And I'm not even dealing with syntax
      highlighting, colorschemes, and the like.


      >>Any 'vi' variant is a *line*-based editor, which presumed a modest
      >>line-size for each. Juggling lines back and forth is easy, but heaving
      >>huge MB-sized chunks o' text is just obscene. Add to that syntax-based
      >>highlighting, multiple colors, etc., and all the processing required for
      >>just *1* line adds exponentially to the amount of work involved, let
      >>alone cursor motions, etc.

      >Well Vim is an editor. Shouldn't it be able to join properly millions
      >of lines, even if that sounds strange? The power of vim comes from

      Sure, it should be able to be pushed to its limits and do so, but not
      necessarily *efficiently*. Ie, it may hit that aforementioned ceiling
      and then start hitting the disk to do so, and pretty much require you to
      leave it running overnight to go and join a brazillion lines into 1
      Uberline. That's not necessarily a "bug", just an unexpected excursion
      outside its performance envelope. The fact that it can create a huge
      Uberline without *crashing* is a testament to the robustness of the
      code. An old version of 'vi' I had would vomit on lines >300chars or
      so.

      Point being, *line*-editors are meant to be used with *lines*, and lines
      of a modest size. The fact that it *can* handle Uberlines is great, but
      you can't expect them to be handled "efficiently". The kind of advice I
      might give would be along the lines of the guy who sees his doctor:

      guy: "Doc, it hurts when I do this."
      doc: "So don't do that."


      >the fact, that you can do many different manipulations very
      >efficiently and does not limit you.

      Absolutely, but again, recall that it's intended to be a *line*-editor.
      Not to appear facetious in repeating that again, but that's what
      'vim'/'gvim' happens to be, a *line*-editor. You yank and put *lines*.
      You add *lines*. You delete *lines*. Hell, syntax highlighting becomes
      so painful for overly-long lines that people wrote add-ons to
      stop highlighting after N columns! That should be Clue #1 that
      overly-long lines are not "natural" to a *line*-editor.


      >Plus :h limits does not talk about joining only a couple of lines ;)

      Of course not. I can 'ls' a list of filenames into a file, do a ':%j'
      to get them into a single line, then prepend a command to run (with the
      filelist as the list of files to operate on) and make an instant
      batchfile. Works great. But there's a huge difference between a
      batch-/shell-command that's 1000chars long, and a 1-line file with a
      100Mchar Uberline.
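
For comparison, the same filelist-to-batchfile trick can be sketched entirely in the shell, without opening the editor. This is only an illustration under some assumptions: `make_batch` is a hypothetical helper name, and the approach breaks on filenames containing spaces or newlines, just as a plain ':%j' on 'ls' output would.

```shell
# make_batch CMD DIR: print a one-line command that runs CMD on every file in DIR.
make_batch() {
    cmd=$1
    dir=$2
    # 'ls' prints one name per line when its output is not a terminal;
    # 'paste -s -d " "' joins those lines with single spaces, much like ':%j'.
    printf '%s %s\n' "$cmd" "$(ls "$dir" | paste -s -d ' ' -)"
}

# Example (commented out so nothing is written by accident):
# make_batch 'wc -l' . > runme.sh
```

The result is still a command line of modest length, which is the case that works well either way.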


      >>Dunno, but to me, that seems like using a text editor to edit a .jpg or
      >>.gif or something, ie, not the right tool for the job, even if, through
      >>herculean contortions and torturing the editor's functionality, it *can*
      >>be done.

      >Exactly. It can. And it might be done by someone.

      And if he has the luxury of letting it run overnight, great. :D


      >>I'd, if anything, edit the file as needed, save it, then use 'sed',
      >>'tr', etc., to post-process it accordingly. No overhead for syntax,
      >>colorschemes, etc. Ie, use the right tool for the job.

      >Yeah, but sed, tr, awk, perl, $language is not always available. And
      >Vim should be able to do it right.

      >What was the reason again to add :vimgrep to vim when grep is
      >available?

      I have no idea, as I don't recall ever using it. <shrug/>


      To reiterate, I *don't* want to appear to be argumentative, but I'm just
      saying that handling Uberlines is something that's *possible* in
      'vim'/'gvim', but don't expect it to be handled "efficiently", not if
      it's well outside the usual performance envelope of file-editing.
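
As a minimal sketch of the 'tr' post-processing route mentioned earlier, assuming the file has already been edited and saved with ordinary line breaks (`join_lines` is a hypothetical helper name):

```shell
# join_lines FILE: write FILE's contents to stdout with every newline
# replaced by a single space.
join_lines() {
    # tr streams its input, so memory use does not depend on the file size.
    tr '\n' ' ' < "$1"
}

# Example (commented out; 'tempfile' is the name used in the quoted test):
# join_lines tempfile > joined
```

Note that this is not an exact equivalent of ':%join': it leaves leading whitespace on each line alone, and the output ends with a trailing space instead of a final newline.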

      --~--~---------~--~----~------------~-------~--~----~
      You received this message from the "vim_use" maillist.
      For more information, visit http://www.vim.org/maillist.php
      -~----------~----~----~----~------~----~------~--~---