Re: [soaplite] Soap and encoding of non ASCII literals

  • Duncan Cameron
    Message 1 of 6 , Jun 16, 2005
      At 2005-06-16, 14:05:55 you wrote:
      >
      >Hello Duncan,
      >
      >So if I understand correctly, I might be double-encoding the strings.
      >
      >What led me to do that, and may have misled me, is the error I was
      >getting from the encoding method.
      >
      >My method call was:
      >
      >use Encode::Encoder qw/encoder/;
      >encoder($string)->bytes('UTF-8')->iso_8859_1;
      >
      >This gave me the error "\xE9 does not map to UTF-8", which is why I
      >thought that é was not a valid UTF-8 code.
      >
      >But then this might not be a SOAP problem but a problem with the
      >encoder method? Do you have any hint as to why it refuses to read the
      >string as a UTF-8 one?
      >
      >I'm sorry, because the more I learn about encoding, the more confused
      >I seem to get ;)
      >
      >My actual goal is the following:
      >
      >My Web service has to write to a database encoded in Latin-1, so I
      >have to encode the UTF-8 string to Latin-1; otherwise the data is not
      >stored correctly. What would be the proper way to ensure that, whatever
      >SOAP client is used (Java, Delphi, Perl, PHP, ...), I get a UTF-8
      >string that I can encode to Latin-1 in the Perl Web service?
      >
      >If this problem is not related to SOAP::Lite, can you suggest a list
      >where I could get help with it? :)
      >
      >Note: Perl is 5.8, running under Apache 1.3/mod_perl.
      >
      >Thanks a lot for your help and explanations,
      >
      >Best Regards,
      >Cédric
      >
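      On the error quoted above, one plausible reading (an assumption, since
      the thread does not confirm what bytes('UTF-8') does internally) is that
      the data was being decoded as UTF-8 although it did not hold UTF-8
      octets. A minimal sketch with plain Encode, using illustrative variable
      names, shows why a lone 0xE9 cannot be decoded that way:

```perl
use strict;
use warnings;
use Encode qw(decode);

# A lone 0xE9 is e-acute in Latin-1 but is not a complete UTF-8 sequence.
my $latin1_octets = "c\xE9dric";
my $copy = $latin1_octets;    # decode() may overwrite its input when CHECK is set
my $chars = eval { decode("UTF-8", $copy, Encode::FB_CROAK()) };
print defined $chars ? "decoded\n" : "decode failed: $@";

# The same word as real UTF-8 octets (C3 A9 for the accent) decodes cleanly.
my $utf8_octets = "c\xC3\xA9dric";
my $decoded = decode("UTF-8", $utf8_octets);
print length($decoded), "\n";    # 6
```

      The first decode fails because 0xE9 announces a multi-byte sequence that
      never follows; the second succeeds and yields six characters.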
      My understanding is that all the parameters passed to your server class
      will be marked as UTF-8 (because they have been through the XML
      parser), so you should be able to convert a string to ISO-8859-1 in
      this way:

      use Encode;
      my $octets = encode("iso-8859-1", $string, 1);

      This should throw an error if $string contains characters that are not
      in ISO-8859-1, so you will need to handle that case within an eval.
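
      A minimal sketch of that conversion wrapped in an eval, assuming the
      incoming value is already a Perl character string. The helper name
      to_latin1 is hypothetical; the CHECK value 1 used above is the same as
      Encode::FB_CROAK.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: returns ISO-8859-1 octets, or undef when the
# string holds a character that Latin-1 cannot represent.
sub to_latin1 {
    my ($string) = @_;
    my $copy = $string;    # encode() may overwrite its input when CHECK is set
    my $octets = eval { encode("iso-8859-1", $copy, Encode::FB_CROAK()) };
    return $octets;        # undef if encode() died
}

my $ok  = to_latin1("c\x{E9}dric");    # e-acute exists in Latin-1
my $bad = to_latin1("\x{20AC}uro");    # the euro sign does not

print defined $ok  ? unpack("H*", $ok)  : "unmappable", "\n";    # 63e964726963
print defined $bad ? unpack("H*", $bad) : "unmappable", "\n";    # unmappable
```

      Returning undef leaves it to the caller to decide whether to reject the
      request or substitute a replacement character.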

      Regards

      Duncan




    • cedric.boufflers
      Message 2 of 6 , Jun 24, 2005
        Hello Duncan and the list readers,

        I have been experimenting and have written a simple Web service in
        Perl.

        This is my method:

        use Data::HexDump;

        sub get_Champs
        {
            my $class = shift;
            my $envelope = pop;    # the SOAP envelope is the last argument

            # Extract the Champ parameter from the request body.
            my $champ = $envelope->valueof("//get_Champs/Champ");

            return SOAP::Data->name('result' => HexDump($champ));
        }

        It just returns a hexadecimal dump of the string received.

        I called it with a standard Java client:


        *- First Test*
        System.out.println(wsenc.get_Champs("cédric"));

        The response was:

                  00 01 02 03 04 05 06 07 - 08 09 0A 0B 0C 0D 0E 0F  0123456789ABCDEF

        00000000  63 E9 64 72 69 63                                  c.dric

        So it seems the accented character is encoded as a single byte here,
        and Perl is not handling the string as UTF-8 in this case.


        *- Second Test*
        System.out.println(wsenc.get_Champs(new String("cédric".getBytes("UTF-8"))));

        The response was:

                  00 01 02 03 04 05 06 07 - 08 09 0A 0B 0C 0D 0E 0F  0123456789ABCDEF

        00000000  63 C3 A9 64 72 69 63                               c..dric

        In this case the accented character is encoded on two bytes. Is that
        because I am double-encoding here? In this case, though, Perl does
        see the string as UTF-8.
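
        Both byte sequences can be reproduced in Perl. This is only a sketch
        with illustrative variable names. It assumes that the Java client's
        default charset was Latin-1 (the thread does not say), in which case
        new String(bytes) would read each UTF-8 byte as one character.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $chars = "c\x{E9}dric";    # six characters; e-acute is U+00E9

my $latin1 = encode("iso-8859-1", $chars);    # one octet per character
my $utf8   = encode("UTF-8", $chars);         # e-acute becomes two octets

# Misreading the UTF-8 octets as Latin-1 yields seven characters
# ("c", U+00C3, U+00A9, "d", "r", "i", "c").
my $misread = decode("iso-8859-1", $utf8);

# Encoding that seven-character string as UTF-8 double-encodes the accent.
my $double = encode("UTF-8", $misread);

print join(" ", unpack("(H2)*", $latin1)), "\n";    # 63 e9 64 72 69 63
print join(" ", unpack("(H2)*", $utf8)), "\n";      # 63 c3 a9 64 72 69 63
print length($misread), "\n";                       # 7
print join(" ", unpack("(H2)*", $double)), "\n";    # 63 c3 83 c2 a9 64 72 69 63
```

        Under that assumption, the second dump would match a seven-character
        string stored one byte per character, rather than a UTF-8 view of
        the original six characters.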

        How can I force Perl or SOAP::Lite to always handle the string as
        UTF-8?

        I have tried adding these lines:

        use POSIX qw(locale_h);
        setlocale(LC_CTYPE, "en_US.UTF-8");

        But this changes nothing. The default locale of the machine is
        already:
        LANG=en_US.UTF-8
        LANGVAR=en_US.UTF-8

        So far nothing I have tried makes a difference.

        Best Regards,
        And thank you for your help.

        Cédric

        Note :
        SOAP::Lite is 0.60
        Perl is perl5 (revision 5.0 version 8 subversion 0).





        --
        ---------------------------------------------------------------------
        BOUFFLERS Cédric : cedric.boufflers@...
        ---------------------------------------------------------------------
        NordNet - 111 Rue de Croix - 59510 Hem - France
        tél : +33 3 20 66 55 55 - fax : +33 3 20 66 55 59
        ---------------------------------------------------------------------
        http://www.securitoo.com/
        http://www.nordnet.fr/
        http://www.lerelaisinternet.com/
        ---------------------------------------------------------------------