Re: [soaplite] Useful module to use with SOAP::Lite
- Hi Duncan,
> > Using SOAP::Lite intensively (thanks Paul!), I had to deal with two
> > minor issues:
> > - soap calls return utf8 strings, even if the content doesn't have any
> > utf8 encoded characters. This is a "feature" of the XML parser. This can
> > be annoying when you put this utf8 data in an HTML template and use
> > an iso-* encoding in the HTML header. The result will be a utf8 page
> > rendered as iso-* content, which is messy. Most XML related modules
> > produce the same "effect".
> Not sure that I understand the problem that you are describing. A utf8
> string that has no multi-byte sequences is identical to an ASCII string
> as it is limited to 7 bits and should then be compatible with ISO-8859
If you concatenate a utf8 flagged string with a non utf8 flagged string,
the result will be a utf8 flagged string. The problem happens when you
display an html page and you specify a charset iso-xxxx because you
expect your content to be iso-xxx. If you mix strings from SOAP::Lite
(flagged utf8 - even if it doesn't contain any utf8 characters) and some
other content (non utf8), the result is a utf8 content.
The browser will receive a utf8 encoded html page with an html header
telling it that it is actually iso-xxxx encoded. The result is bad.
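
A minimal sketch of the flag behaviour described above, using only core
Perl. The variable names are made up for illustration, and the flagged
string is produced with utf8::upgrade() as a stand-in for a value coming
back from SOAP::Lite; no SOAP call is actually made here.

```perl
use strict;
use warnings;

# Stand-in for a value returned by SOAP::Lite: ASCII-only content that
# nevertheless carries Perl's internal utf8 flag.
my $from_soap = 'Hello';
utf8::upgrade($from_soap);

# Stand-in for template content held as ISO-8859-1 bytes
# ("cafe" with an e-acute, a single byte 0xE9).
my $latin1 = "caf\xE9";

# Concatenation upgrades the unflagged operand, so the result is flagged.
my $mixed = $from_soap . ' ' . $latin1;

for my $pair ([from_soap => $from_soap], [latin1 => $latin1], [mixed => $mixed]) {
    my ($name, $value) = @$pair;
    printf "%-9s utf8 flag: %s\n", $name, utf8::is_utf8($value) ? 'on' : 'off';
}
```

The concatenated string still has the same characters, but it is now
flagged. What bytes finally reach the browser depends on the Perl
version and on any I/O layers on the output handle, so treat this only
as a way to see where the flag comes from.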
> > - When you send data structures containing blessed references,
> > the encoded XML is different than if the data was not blessed. This can
> > have some undesirable side effects with non perl SOAP implementations.
> > So, here is a new module Data::Structure::Util:
> > It enables, amongst other things, to encode/decode any string to/from
> > utf8 within a data structure. It also enables to remove the blessing of
> > any reference within the structure.
> Looking at the POD for the module it is not clear what the utf8_on() and
> utf8_off() routines do. Do they simply set or clear the utf8 flag attached to
> each scalar? Or do they do more than that?
I should update the doc then.
They do more than that: they actually call the perl API to encode/decode (if possible) the string.
The flag will be set/unset as the result of the transformation.
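
To make the idea concrete, here is a rough pure-Perl sketch of what
"decode if possible and drop the blessing" can look like for a nested
structure. This is not the code of Data::Structure::Util: plain_copy()
is a hypothetical helper written for this example, it returns a copy
instead of touching the original, and it only walks arrays and hashes.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

# Hypothetical helper: returns an unblessed deep copy of $data in which
# every string that fits in Latin-1 has been downgraded (utf8 flag off).
sub plain_copy {
    my ($data) = @_;
    my $type = reftype($data);    # undef for plain scalars

    if (!defined $type) {
        # Second argument true means "do not die" if the string holds
        # characters above 0xFF; such strings simply stay flagged.
        utf8::downgrade($data, 1) if defined $data && utf8::is_utf8($data);
        return $data;
    }
    if ($type eq 'ARRAY') {
        return [ map { plain_copy($_) } @$data ];
    }
    if ($type eq 'HASH') {
        my %copy;    # hash keys are copied as they are
        $copy{$_} = plain_copy($data->{$_}) for keys %$data;
        return \%copy;
    }
    return $data;    # other reference types are passed through untouched
}

# Example input: a blessed hash holding a flagged string.
my $name = "Andr\xE9";
utf8::upgrade($name);
my $record = bless { name => $name, tags => [ 'a', 'b' ] }, 'My::Record';

my $plain = plain_copy($record);
printf "class of copy: %s\n", ref $plain;
printf "name flagged in copy: %s\n", utf8::is_utf8($plain->{name}) ? 'yes' : 'no';
```

The copy is a plain hash reference, so a serializer that treats blessed
and unblessed data differently would see it as unblessed, while the
original object is left as it was.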
- At 21:02:31 on 2003-11-05 Randy J. Ray <rjray@...> wrote:
>> Not sure that I understand the problem that you are describing. A utf8
>> string that has no multi-byte sequences is identical to an ASCII string
>> as it is limited to 7 bits and should then be compatible with ISO-8859
>UTF-8 and ISO-8859-1 overlap, but are not identical.
That's right but Pierre was referring to utf-8 encoded strings that
don't have any utf-8 characters. I took that to mean only single byte
character sequences, in effect 7 bit characters.
>The problem is not with the parser module, it's with the XML specification.
>The XML spec says that a document's encoding defaults to UTF-8 in the absence
>of an explicit declaration.
>Most parsers take this a step further and convert the text nodes they return
>to the application to UTF-8 even when the document is explicitly encoded
>otherwise. While this can be inconvenient, it's also consistent and predictable
>behavior. In an application I recently wrote with XML::Parser (that did not
>involve SOAP or XML-RPC, but did have to deal with encoding issues), I used
>the "use bytes" pragma, and was able to do what I needed with the text data,
>with no problems.
I am still not convinced by Pierre's approach of messing with the
utf-8 flag on a variable. I think that explicitly converting the
utf-8 output from SOAP::Lite into whatever encoding he wants for his
web page is cleaner and safer.
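
For what it's worth, a small sketch of the explicit conversion I have in
mind, using the core Encode module. The input here is a hard-coded
stand-in for UTF-8 text from an XML document, the variable names are
only illustrative, and ISO-8859-1 is assumed as the page encoding.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Stand-in for UTF-8 octets taken from an XML text node
# ("cafe" with an e-acute, encoded as the two bytes C3 A9).
my $utf8_octets = "caf\xC3\xA9";

# Step 1: turn the octets into a Perl character string.
my $text = decode('UTF-8', $utf8_octets);

# Step 2: encode the characters into the charset announced in the HTML
# header. FB_CROAK makes encode() die on characters that ISO-8859-1
# cannot represent; LEAVE_SRC keeps $text unmodified.
my $page_bytes = encode('ISO-8859-1', $text, Encode::FB_CROAK | Encode::LEAVE_SRC);

printf "characters: %d, output bytes: %d\n", length($text), length($page_bytes);
```

Done this way, the bytes sent to the browser are in the declared
charset by construction, and anything that cannot be represented shows
up as an error at the conversion step rather than as a garbled page.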