Re: [soaplite] Soap and encoding of non ASCII literals

  • cedric.boufflers
    Message 1 of 6 , Jun 24, 2005
      Hello Duncan and the list readers,

      I have been experimenting, and I have written a simple Web service
      in Perl.

      This is my method:

      sub get_Champs
      {
          my $class    = shift;
          my $envelope = pop;

          my $champ = $envelope->valueof("//get_Champs/Champ");

          use Data::HexDump;

          return SOAP::Data->name('result' => HexDump($champ));
      }

      It just returns a hexadecimal dump of the string received.

      I called it from a standard Java client:


      *- First Test*
      System.out.println(wsenc.get_Champs("cédric"));

      In response I had :
                00 01 02 03 04 05 06 07 - 08 09 0A 0B 0C 0D 0E 0F  0123456789ABCDEF

      00000000  63 E9 64 72 69 63                                  c.dric

      So it seems the accented character is encoded as a single byte (E9)
      here, and Perl does not treat the string as UTF-8 in this case.


      *- Second Test*
      System.out.println(wsenc.get_Champs(new
      String("cédric".getBytes("UTF-8"))));

      In response I had :
                00 01 02 03 04 05 06 07 - 08 09 0A 0B 0C 0D 0E 0F  0123456789ABCDEF

      00000000  63 C3 A9 64 72 69 63                               c..dric

      In this case the accented character is encoded as two bytes (C3 A9).
      Is that because I am encoding twice here? Yet in this case Perl does
      see the string as UTF-8.

      How can I force Perl or SOAP::Lite to always handle the string as
      UTF-8?

      I have tried adding these lines:

      use POSIX qw(locale_h);
      setlocale(LC_CTYPE, "en_US.UTF-8");

      But it changes nothing, even though the machine's default locale is
      already:
      LANG=en_US.UTF-8
      LANGVAR=en_US.UTF-8

      Best Regards,
      And thank you for your help.

      Cédric

      Note :
      SOAP::Lite is 0.60
      Perl is perl5 (revision 5.0 version 8 subversion 0).



      Duncan Cameron wrote:

      >At 2005-06-16, 14:05:55 you wrote:
      >
      >>Hello Duncan,
      >>
      >>So if I understand correctly, I might be trying to double-encode the
      >>strings.
      >>
      >>But what led me to do that, and may have misled me, is the error I
      >>was getting from the encoding method.
      >>
      >>My method call was:
      >>
      >>use Encode::Encoder qw/encoder/;
      >>encoder($string)->bytes('UTF-8')->iso_8859_1;
      >>
      >>And this was giving me the error "\xE9 does not map to UTF-8". So this
      >>is why I thought that é was not a valid UTF-8 code.
      >>
      >>But then this might not be a SOAP problem but an encoder method
      >>problem? Do you have any hint as to why it refuses to read the string
      >>as a UTF-8 one, then?
      >>
      >>I'm sorry, because the more I learn about encoding, the more I seem
      >>to get confused by it ;)
      >>
      >>Actually my goal is the following:
      >>
      >>My Web service has to write to a database encoded in Latin-1. So I
      >>have to encode the UTF-8 string to Latin-1, otherwise the data are not
      >>stored correctly in the database. What would be the proper way to
      >>ensure that, whatever SOAP client is used (Java, Delphi, Perl, PHP,
      >>...), I will get a UTF-8 string that I can encode to Latin-1 in the
      >>Perl Web service?
      >>
      >>If this problem is not SOAP::Lite related, do you have any hints
      >>about a list where I could get help with it? :)
      >>
      >>Note: Perl is 5.8 and is running under Apache 1.3/mod_perl.
      >>
      >>Thanks a lot for your help and explanations,
      >>
      >>Best Regards,
      >>Cédric
      >>
      >My understanding is that all the parameters passed to your server class
      >will be marked as UTF-8 (because they have been through the XML
      >parser), so you should be able to convert a string to 8859-1 in this
      >way:
      >
      >my $octets = encode("iso-8859-1", $string, 1);
      >
      >this should throw an error if $string contains characters that are not
      >in 8859-1, so you will need to handle that event within an eval.
      >
      >Regards
      >
      >Duncan


      --
      ---------------------------------------------------------------------
      BOUFFLERS Cédric : cedric.boufflers@...
      ---------------------------------------------------------------------
      NordNet - 111 Rue de Croix - 59510 Hem - France
      tél : +33 3 20 66 55 55 - fax : +33 3 20 66 55 59
      ---------------------------------------------------------------------
      http://www.securitoo.com/
      http://www.nordnet.fr/
      http://www.lerelaisinternet.com/
      ---------------------------------------------------------------------