On 5/29/08, olivier vasseur <oliviervasseurtrad@...
> In French we also need to add spaces, more precisely
> unbreakable spaces before some punctuation marks.
I hadn't thought that far ahead. Yes, eliminating spaces before
English commas, semicolons, and colons would be nice. (But not periods
because of file extensions like .txt and .odt.) I give this low
priority, however, because, as you point out, rules are
language-specific (or continent-specific for English). No "one size
fits all" means adding configuration options.
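To make the configuration idea concrete, here is a rough Python sketch of per-language punctuation rules. Everything in it is an assumption for illustration: the PUNCTUATION_RULES table, the language codes, and the fix_punctuation_spacing name are made up, and the French rule only converts an existing ordinary space into a no-break space rather than inserting a missing one.

```python
import re

NBSP = "\u00a0"  # no-break space

# Hypothetical rule table: language code -> list of (pattern, replacement).
# Periods are left alone on purpose (file extensions like .txt and .odt).
PUNCTUATION_RULES = {
    "en": [
        # Drop spaces and tabs before commas, semicolons, and colons.
        (re.compile(r"[ \t]+([,;:])"), r"\1"),
    ],
    "fr": [
        # Drop spaces and tabs before commas.
        (re.compile(r"[ \t]+,"), ","),
        # Turn ordinary spaces before ; : ! ? into one no-break space.
        # Inserting a missing space is not attempted here, because that
        # would also touch URLs and times such as 10:30.
        (re.compile(r"[ \t]+([;:!?])"), NBSP + r"\1"),
    ],
}


def fix_punctuation_spacing(text, lang):
    """Apply the rules for lang; unknown languages pass through unchanged."""
    for pattern, replacement in PUNCTUATION_RULES.get(lang, []):
        text = pattern.sub(replacement, text)
    return text
```

A language with no entry in the table is returned untouched, which is one way to keep the feature off by default.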
What I had in mind was the traditional text editor option of
automatically deleting spaces at the end of lines. (And MS Word's bad
habit of allowing spaces to accumulate at the end of paragraphs.)
I'd say go one step further: at most one space (s/ +/ /g) between words
and zero at the ends (s/^[ \t\n]+// and s/[ \t\n]+$//).
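For anyone who prefers to see the three substitutions spelled out, a minimal Python equivalent might look like this. The function name normalize_spaces is only illustrative. Like the sed-style expressions, it collapses runs of ordinary spaces only, so tabs inside a segment are left as they are.

```python
import re


def normalize_spaces(segment):
    """Trim whitespace at both ends, then collapse runs of spaces to one."""
    segment = re.sub(r"^[ \t\n]+", "", segment)  # s/^[ \t\n]+//
    segment = re.sub(r"[ \t\n]+$", "", segment)  # s/[ \t\n]+$//
    return re.sub(r" +", " ", segment)           # s/ +/ /, applied globally
```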
Looking further ahead: What about callbacks applying such processing
to source and target segments as the translator steps from segment to
segment?
I have a language-specific need to insert spaces in the input to make
glossaries usable (Chinese and Japanese don't use spaces) and remove
excess ones in the output. The latter filter should be very easy to
implement. The former will take time.
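A rough sketch of how the callback idea could be wired up, in Python for brevity. The SegmentFilters class and its on_segment_change method are hypothetical names, not part of any existing tool. The only filter shown is the easy one that removes excess spaces from the output; the space-inserting input filter for Chinese and Japanese is not attempted.

```python
import re
from typing import Callable, List, Tuple

Filter = Callable[[str], str]


class SegmentFilters:
    """Hypothetical registry of filters run when the translator changes segment."""

    def __init__(self) -> None:
        self.source_filters: List[Filter] = []
        self.target_filters: List[Filter] = []

    def on_segment_change(self, source: str, target: str) -> Tuple[str, str]:
        """Run each registered filter in order and return the filtered pair."""
        for apply_filter in self.source_filters:
            source = apply_filter(source)
        for apply_filter in self.target_filters:
            target = apply_filter(target)
        return source, target


def remove_excess_spaces(text: str) -> str:
    """Output filter: strip both ends and collapse runs of spaces."""
    return re.sub(r" +", " ", text.strip())


filters = SegmentFilters()
filters.target_filters.append(remove_excess_spaces)
```

With this setup, on_segment_change leaves the source text untouched until an input filter is registered and cleans only the target.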