
996 RE: UTF8 issue

  • Fernando Munoz
    Jan 29, 2003
      Thanks Philip, that solves the problem. I also managed to find a less
      elegant but equally effective solution. It operates on the string, passing
      the result to a second scalar that gets encoded as a string of bytes:

      # These are UTF8
      my ($description, $value) = split(":", $biblio[$n]);
      # Here $value goes back to a string of bytes
      $value = sprintf("%4.2f", $value);
      my $lstring = length($description);
      # Here $newdesc has $description as a string of bytes
      my $newdesc = substr($description, 0, $lstring);

      After this the digests are all different and correct. It is not elegant but

      Thanks again.

      -----Original Message-----
      From: Philip Mak [mailto:pmak@...]
      Sent: Wednesday, January 29, 2003 10:07 AM
      To: Fernando Munoz
      Cc: 'asp@...'
      Subject: Re: UTF8 issue

      I'm guessing you'll have to somehow "cast" the UTF8 strings so that
      they're interpreted byte-by-byte, rather than character-by-character.

      Maybe try "use utf8;" and then pass utf8::encode($str) instead of $str
      to the MD5 function.
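
A minimal sketch of the byte-encoding idea, assuming Digest::MD5 is installed. One caveat: utf8::encode converts its argument in place and does not return the encoded string, so it is called first and the variable is then passed to the digest function. The helper name md5_hex_utf8 and the sample strings are hypothetical, for illustration only.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: digest a character string via its UTF-8 bytes.
sub md5_hex_utf8 {
    my ($str) = @_;        # work on a copy so the caller's string is untouched
    utf8::encode($str);    # in place: characters become UTF-8 encoded bytes
    return md5_hex($str);  # MD5 now sees plain bytes
}

# Two different strings containing a character above 255 (sample data).
my $first  = "\x{263A} one";
my $second = "\x{263A} two";

print md5_hex_utf8($first),  "\n";
print md5_hex_utf8($second), "\n";
```

Because the helper encodes a copy, the original character strings can still be used for length(), substr() and so on after the digest is taken.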

      On Wed, Jan 29, 2003 at 09:50:13AM -0800, Fernando Munoz wrote:
      > Well, there's no error logging that I can refer to, but when you try
      > to hexdec these strings (the ones coming in UTF8) no matter how
      > different the strings are, they always return the same digest.
      > Searching around I find this note :
      > "Perl 5.8 support Unicode characters in strings. Since the MD5
      > algorithm is only defined for strings of bytes, it can not be used
      > on strings that contains chars with ordinal number above 255. The
      > MD5 functions and methods will croak if you try to feed them such
      > input data:"
      > in the documentation for Digest::MD5
      > (http://search.cpan.org/author/GAAS/Digest-MD5/MD5.pm).
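
The behaviour described in the quoted note can be checked with a short sketch like the one below, assuming a Digest::MD5 version that croaks on wide characters as documented. The sample character is arbitrary; eval is used only to trap the error so the script keeps running.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# A single character with an ordinal above 255 (arbitrary sample).
my $wide = "\x{263A}";

# md5_hex is documented to croak on such input, so trap it with eval.
my $digest = eval { md5_hex($wide) };
my $error  = $@;

if (!defined $digest) {
    print "md5_hex refused the string: $error";
}
```

If no error is reported on a given setup, the string may already have been turned into bytes somewhere upstream, which is worth checking before comparing digests.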
