Re: [XSL-FO] French characters in XML

  • paul@precisiondocuments.com
    Message 1 of 12 , May 21 10:01 AM
      On Tue, May 21, 2002 at 02:54:09PM +0200, Anders Svensson wrote:

      > I have a problem with foreign characters in xml documents.

      I recently had an opportunity to improve my knowledge of character
      encoding, and found the following article very helpful.

      It might be more than you want, and it might not even answer your
      specific question, but it has some good information.

      http://tronweb.super-nova.co.jp/characcodehist.html

      Later,
      --Paul
    • Vincent De Groote
      Message 2 of 12 , May 22 12:21 AM
        Your input file seems to be correct (that is, it carries the correct encoding information) and is displayed correctly in IE (which uses MSXML 3.0). So, where do you see the problem?
        -----Original Message-----
        From: Anders Svensson [mailto:asn@...]
        Sent: Tuesday, May 21, 2002 16:46
        To: XSL-FO@yahoogroups.com
        Subject: RE: [XSL-FO] French characters in XML

        Hi, Vincent!
         
        I'm enclosing a sample file. I tried changing the encoding specification at the top of the file, but it didn't help...
         
        /Anders
         
        -----Original Message-----
        From: Vincent De Groote [mailto:vincent.degroote@...]
        Sent: den 21 maj 2002 16:21
        To: XSL-FO@yahoogroups.com
        Subject: RE: [XSL-FO] French characters in XML

        Look at the first line of your file.  If I understand your problem correctly, the first line should be
         
        <?xml version="1.0" encoding="ISO-8859-1"?>.
         
        You can also send me a sample of your file (to my private email) if you want.
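
One way to check whether a file's bytes agree with its XML declaration is to decode them with the declared encoding. The Python sketch below is an illustration only: the function names are hypothetical, and it assumes an ASCII-compatible encoding with no byte order mark. Note that ISO-8859-1 accepts any byte sequence, so this check can catch ISO-8859-1 bytes mislabelled as UTF-8 but not the reverse.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the start of the data.
_DECL = re.compile(rb"""<\?xml[^>]*encoding=["']([A-Za-z0-9._-]+)["']""")


def declared_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, or the XML default (UTF-8)."""
    match = _DECL.match(data)
    return match.group(1).decode("ascii") if match else default


def bytes_match_declaration(data: bytes) -> bool:
    """True if the raw bytes can be decoded with the encoding the file declares."""
    try:
        data.decode(declared_encoding(data))
        return True
    except (UnicodeDecodeError, LookupError):
        # Invalid byte sequence, or an encoding name Python does not know.
        return False


sample = '<?xml version="1.0" encoding="UTF-8"?><p>différentes</p>'
# The bytes are really ISO-8859-1, but the declaration claims UTF-8.
print(bytes_match_declaration(sample.encode("iso-8859-1")))  # False
```

If the check fails, either the declaration or the way the file was saved needs to change so that the two agree.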

        -----Original Message-----
        From: Anders Svensson [mailto:asn@...]
        Sent: Tuesday, May 21, 2002 15:50
        To: XSL-FO@yahoogroups.com
        Subject: RE: [XSL-FO] French characters in XML

        I don't think I understand encoding very well, so I'm not sure what kind of encoding it "really" is... All I can tell is what the declaration at the top of the file says, and that when I look at the XML file it looks like an OK text file and the XML tagging is correct. But if some sort of hidden encoding lurks beneath the surface somehow, then I have no idea how that works and what to do about it...
         
        /Anders
         
        -----Original Message-----
        From: Vincent De Groote [mailto:vincent.degroote@...]
        Sent: den 21 maj 2002 15:19
        To: XSL-FO@yahoogroups.com
        Subject: RE: [XSL-FO] French characters in XML

        It seems that, in the word "différentes" for example, three characters disappear. In ISO-8859-1 the é character is coded E9 in hexadecimal, which is 1110 1001 in binary. In UTF-8, a byte whose high nibble is 1110 announces a three-byte sequence, so a parser that assumes UTF-8 treats the é and the two bytes after it as a single character.
         
        Are you sure the encoding information in your XML file is correct? Isn't it an ISO Latin encoded file?
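
The lead-byte reasoning above can be sketched in Python. This is a simulation under stated assumptions, not MSXML's actual decoder: the function names are hypothetical, and the "lenient" decoder simply trusts the lead byte, swallows the number of bytes it announces, and emits "?" for the result. Under those assumptions it reproduces the damage seen in the sample text.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """How many bytes a UTF-8 decoder expects, judging by the lead byte alone."""
    if lead_byte < 0x80:            # 0xxxxxxx: plain ASCII
        return 1
    if lead_byte >> 5 == 0b110:     # 110xxxxx: two-byte sequence
        return 2
    if lead_byte >> 4 == 0b1110:    # 1110xxxx: three-byte sequence
        return 3
    if lead_byte >> 3 == 0b11110:   # 11110xxx: four-byte sequence
        return 4
    return 1                        # stray continuation byte or invalid lead


def garble_like_lenient_decoder(text: str) -> str:
    """Encode text as ISO-8859-1, then read it back as if it were UTF-8.

    Each non-ASCII lead byte consumes the bytes it announces, without
    validation, and is shown as a single '?'.
    """
    raw = text.encode("iso-8859-1")
    out = []
    i = 0
    while i < len(raw):
        length = utf8_sequence_length(raw[i])
        out.append(chr(raw[i]) if raw[i] < 0x80 else "?")
        i += length
    return "".join(out)


print(f"{0xE9:08b}")                               # 11101001
print(utf8_sequence_length(0xE9))                  # 3
print(garble_like_lenient_decoder("différentes"))  # diff?ntes
print(garble_like_lenient_decoder("définir"))      # d?nir
```

A strict decoder would reject the sequence instead, because "r" and "e" are not valid continuation bytes (they do not start with the bits 10).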
         
        -----Original Message-----
        From: Anders Svensson [mailto:asn@...]
        Sent: Tuesday, May 21, 2002 14:54
        To: XSL-FO@yahoogroups.com
        Subject: [XSL-FO] French characters in XML

        I have a problem with foreign characters in XML documents. I'm guessing it's one of those Unicode-related problems that seem to recur in one capricious form or another every once in a while... I'm transforming the document with MSXML, but when I do, it comes out ruined, with all kinds of Asian characters and other signs, like so (I hope this renders correctly in the e-mail):

        Vous pouvez d?nir des colonnes diff?ntes ?tiliser dans les diverses ex?tions de reconnaissance Cependant, en s?ctionnant une colonne diff?nte, d'autres variables peuvent aussi ?e chang? entre les ex?tions. La table d?it comment reconna?e les colonnes.

        Does anyone know why, and what to do about it? One thing to note is that I did the same transformation in XML Spy (which uses MSXML as its parser) and the text came out OK... Unfortunately, doing the transformations within XML Spy is not an option since there are a lot of files to transform. Also, it's not a particularly satisfactory solution in the long run... :-)

        Any help appreciated!

        /Anders


        To unsubscribe from this group, send an email to:
        XSL-FO-unsubscribe@egroups.com



        Your use of Yahoo! Groups is subject to the Yahoo! Terms of Service.


      • Anders Svensson
        Message 3 of 12 , May 22 12:36 AM
          The problem is that it displays correctly when I look at it directly in IE, and when I transform it inside XML Spy (also using MSXML), but not when I transform files using MSXML directly as a standalone parser/transformer (see the example below, which actually displays Asian characters where there are question marks in this e-mail). Like I said, I'm completely baffled by the phenomenon myself. I have no idea why transforming the document directly in MSXML would differ from doing it inside XML Spy, but it does. The reason I have to do the transformations directly in MSXML is that I need to do multiple transformations of hundreds of files, so doing it inside XML Spy is not a viable option...
           
          /Anders
           
        • Vincent De Groote
          Message 4 of 12 , May 22 12:42 AM
            Please forward me
             
            - the transformation file
            - the exact command you use to launch the transformation
             
            or
             
            - the result of the transformation using msxml
             
            vincent
          • Anders Svensson
            Message 5 of 12 , May 22 1:04 AM
              Unfortunately I'm not allowed to send transformation files due to company restrictions, but I can send the transformed HTML file. I hope that can give some clue as to what is going on...
               
              I can also send the VB code for executing the MSXML command:
               
              Sub Transform(pathVar, allFiles, resultName, xmlFile() As String)

                  Dim Source As New MSXML2.DOMDocument
                  Dim stylesheet As New MSXML2.DOMDocument

                  ' Remember whether the caller supplied a result name
                  newName = False
                  If resultName <> "" Then
                      newName = True
                  End If

                  For i = 1 To allFiles
                      ' Load data.
                      Source.async = False
                      Source.Load pathVar & xmlFile(i)

                      ' Load style sheet.
                      stylesheet.async = False
                      stylesheet.Load xslPath

                      ' Do the transform (transformNode returns the result as a string)
                      transformResult = Source.transformNode(stylesheet)

                      'MsgBox transformResult

                      ' Print the results to a file
                      If newName = False Then
                          ' Strip the 4-character extension and append the output format
                          lenXmlFile = Len(xmlFile(i))
                          cutFileName = Left(xmlFile(i), lenXmlFile - 4)
                          resultName = cutFileName & "." & fileFormat
                      Else
                          resultName = resultName & "." & fileFormat
                      End If

                      Open pathVar & resultName For Output As #i
                      Print #i, transformResult
                      Close #i
                      ' End of printing

                      Source.save (pathVar & xmlFile(i))
                  Next i

              End Sub
               
              /Anders
            • Vincent De Groote
              Message 6 of 12 , May 22 1:28 AM
                The encoding of the HTML file is invalid: look at the third line: encoding=UTF-16.
                 
                Your file is not UTF-16 encoded (in UTF-16, most characters are coded on 2 bytes).
                 
                For a proof: display the file in IE (OK, it looks strange).
                Then select View/Encoding/Western European (Windows)
                and it looks good.
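
The effect of the wrong label can be reproduced in Python. This is a minimal sketch, assuming the file's bytes are really Windows-1252 and that the viewer reads them as little-endian UTF-16 without a byte order mark; the helper name is hypothetical. Every pair of single-byte characters collapses into one 16-bit code unit, which for ordinary Latin text lands in the CJK ideograph range.

```python
def reinterpret(raw: bytes, encoding: str) -> str:
    """Decode the same bytes under a different assumption about their encoding."""
    return raw.decode(encoding, errors="replace")


raw = "différente".encode("windows-1252")   # 10 bytes, one per character

as_utf16 = reinterpret(raw, "utf-16-le")    # what a UTF-16 label leads to
as_cp1252 = reinterpret(raw, "windows-1252")

print([f"U+{ord(c):04X}" for c in as_utf16])
# ['U+6964', 'U+6666', 'U+72E9', 'U+6E65', 'U+6574']
print(as_cp1252)                            # différente
```

All five code points fall between U+4E00 and U+9FFF, which is why the page shows Asian characters until the encoding is switched to Western European.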
                 
                Please check the encoding directives (if any) in your transformation file. If you find a way to specify it, it should look like 'iso-8859-1'.
                 
                Another potential option is to give the same directives to the parser (maybe as an option in the save operation). I have never used this parser, so I can't help you with that.
                 
                Or drop the wrong encoding specification after the file has been generated. It seems to be displayed correctly without an encoding specification.
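
The last option, fixing the label after generation, could be scripted along these lines. This Python sketch is only an illustration: the function name is hypothetical, and it assumes the label appears as `charset=UTF-16` or `encoding="UTF-16"` in a file whose bytes are actually single-byte Western European text. It rewrites the label rather than dropping it, so the file states what it really contains.

```python
import re

# Matches a charset/encoding label naming UTF-16, with or without quotes.
_LABEL = re.compile(rb"""\b(charset|encoding)(\s*=\s*["']?)utf-16""", re.IGNORECASE)


def relabel_charset(raw: bytes, actual: str = "windows-1252") -> bytes:
    """Replace a UTF-16 label with the encoding the bytes really use."""
    return _LABEL.sub(
        lambda m: m.group(1) + m.group(2) + actual.encode("ascii"),
        raw,
    )


html = b'<META http-equiv="Content-Type" content="text/html; charset=UTF-16">'
print(relabel_charset(html).decode("ascii"))
# <META http-equiv="Content-Type" content="text/html; charset=windows-1252">
```

Working on raw bytes avoids having to guess the encoding just to edit the label; for a batch, the same function can be applied to each generated file in turn.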
                 
                Vincent
                 
                 
                 
                 
                 
                 
                 
                 
                 -----Original Message-----
                From: Anders Svensson [mailto:asn@...]
                Sent: Wednesday, May 22, 2002 10:04
                To: XSL-FO@yahoogroups.com
                Subject: RE: [XSL-FO] French characters in XML

                Unfortunately I'm not allowed to send transformation files due to company restrictions, but I can send the transformed HTML file. I hope that can give some clue as to what is going on...
                 
                I can also send the VB code for executing the MSXML command:
                 
                Sub Transform(pathVar, allFiles, resultName, xmlFile() As String)

                    Dim Source As New MSXML2.DOMDocument
                    Dim stylesheet As New MSXML2.DOMDocument

                    newName = False
                    If resultName <> "" Then
                        newName = True
                    End If

                    For i = 1 To allFiles
                        ' Load data.
                        Source.async = False
                        Source.Load pathVar & xmlFile(i)

                        ' Load style sheet.
                        stylesheet.async = False
                        stylesheet.Load xslPath

                        ' Do the transform
                        transformResult = Source.transformNode(stylesheet)

                        'MsgBox transformResult

                        'Print the results to a file
                        If newName = False Then
                            lenXmlFile = Len(xmlFile(i))
                            cutFileName = Left(xmlFile(i), lenXmlFile - 4)
                            resultName = cutFileName & "." & fileFormat
                        Else
                            resultName = resultName & "." & fileFormat
                        End If

                        Open pathVar & resultName For Output As #i
                        Print #i, transformResult
                        Close #i
                        'End of printing

                        Source.save (pathVar & xmlFile(i))
                    Next i

                End Sub
                 
                /Anders
                -----Original Message-----
                From: Vincent De Groote [mailto:vincent.degroote@...]
                Sent: den 22 maj 2002 09:42
                To: XSL-FO@yahoogroups.com
                Subject: RE: [XSL-FO] French characters in XML

                Please forward me
                 
                - the transformation file
                - the exact command you use to launch the transformation
                 
                or
                 
                - the result of the transformation using msxml
                 
                vincent
                -----Original Message-----
                From: Anders Svensson [mailto:asn@...]
                Sent: Wednesday, May 22, 2002 09:36
                To: XSL-FO@yahoogroups.com
                Subject: RE: [XSL-FO] French characters in XML

                The problem is that it displays correctly when looking at it directly in IE, and when transforming it inside XML Spy (also using MSXML), but not when I transform files using MSXML directly as a standalone parser/transformer (see the example below, which actually displays Asian characters where there are question marks in this e-mail). Like I said, I'm completely baffled by the phenomenon myself. I have no idea why it would be different when I transform the document directly in MSXML compared to when doing it inside XML Spy, but it is. The reason I have to do transformations directly in MSXML is that I need to do multiple transformations of hundreds of files, so doing it inside XML Spy is not a viable option...
                 
                /Anders
                 
                -----Original Message-----
                From: Vincent De Groote [mailto:vincent.degroote@...]
                Sent: den 22 maj 2002 09:22
                To: XSL-FO@yahoogroups.com
                Subject: RE: [XSL-FO] French characters in XML

                Your input file seems to be correct (that is, it has correct encoding information) and is displayed correctly in IE (which uses MSXML 3.0). So, where do you see the problem?
                -----Original Message-----
                From: Anders Svensson [mailto:asn@...]
                Sent: Tuesday, May 21, 2002 16:46
                To: XSL-FO@yahoogroups.com
                Subject: RE: [XSL-FO] French characters in XML

                Hi, Vincent!
                 
                I'm enclosing a sample file. I tried changing the encoding specification at the top of the file, but it didn't help...
                 
                /Anders
                 
                -----Original Message-----
                From: Vincent De Groote [mailto:vincent.degroote@...]
                Sent: den 21 maj 2002 16:21
                To: XSL-FO@yahoogroups.com
                Subject: RE: [XSL-FO] French characters in XML

                Look at the first line of your file.  If I understand your problem correctly, the first line should be
                 
                <?xml version="1.0" encoding="ISO-8859-1"?>.
                 
                You can also send me a sample of your file (to my private email) if you want.
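
A minimal sketch of why the declaration matters, written in Python rather than MSXML. It assumes Python's standard `xml.etree.ElementTree` parser behaves like any conforming XML parser here; `parse_text` and `try_parse` are hypothetical helper names. The same ISO-8859-1 bytes parse correctly with the declaration above, and are rejected when the parser falls back to the UTF-8 default.

```python
import xml.etree.ElementTree as ET

# "différentes" stored as ISO-8859-1: the é is the single byte 0xE9.
BODY = b"<p>diff\xe9rentes</p>"


def parse_text(document: bytes) -> str:
    """Parse an XML document given as raw bytes and return the root element's text."""
    return ET.fromstring(document).text


def try_parse(document: bytes):
    """Return the root text, or None if the parser rejects the document."""
    try:
        return parse_text(document)
    except ET.ParseError:
        return None


# Declared as ISO-8859-1: the parser decodes 0xE9 as é.
with_decl = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + BODY

# No encoding declared: the parser assumes UTF-8, and 0xE9 followed by "r"
# is not a valid UTF-8 sequence, so parsing fails.
without_decl = b'<?xml version="1.0"?>' + BODY

if __name__ == "__main__":
    print(try_parse(with_decl))
    print(try_parse(without_decl))
```

A strict parser refuses the undeclared file outright; a more lenient one may silently produce wrong characters instead, which would look like the damage described in this thread.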

                -----Original Message-----
                From: Anders Svensson [mailto:asn@...]
                Sent: Tuesday, May 21, 2002 15:50
                To: XSL-FO@yahoogroups.com
                Subject: RE: [XSL-FO] French characters in XML

                I don't think I understand encoding very well, so I'm not sure what kind of encoding it "really" is... All I can tell is what the processing instruction says at the top of the file, and the fact that when I look at the xml file it looks like an ok text file and that the xml tagging is correct. But if some sort of hidden encoding lurks beneath the surface somehow, then I have no idea how that works and what to do about it...
                 
                /Anders
                 
                -----Original Message-----
                From: Vincent De Groote [mailto:vincent.degroote@...]
                Sent: den 21 maj 2002 15:19
                To: XSL-FO@yahoogroups.com
                Subject: RE: [XSL-FO] French characters in XML

                It seems that, for example in the word "différentes", 3 characters disappear.  The é character is usually coded as E9 in hexadecimal, and E9 is 1110 1001 in binary.  In UTF-8, a byte whose high nibble is 1110 starts a three-byte sequence, so your parser interprets the é and the 2 bytes that follow it as a single character.
                 
                Are you sure the encoding information in your XML file is correct?  Isn't it an ISO Latin encoded file?
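
The byte arithmetic above can be sketched in Python. This is an illustration under assumptions, not a description of MSXML internals: `lenient_utf8_decode` is a hypothetical decoder that trusts the lead byte and never checks the continuation bytes, which is the behaviour the explanation above supposes.

```python
def lenient_utf8_decode(data: bytes) -> str:
    """Decode UTF-8 while trusting lead bytes and NOT validating continuation bytes.

    Hypothetical model of a lenient parser; a strict decoder would raise instead.
    Four-byte sequences are left out to keep the sketch short.
    """
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:              # 0xxxxxxx: plain ASCII
            length, code = 1, lead
        elif lead >> 5 == 0b110:     # 110xxxxx: two-byte sequence
            length, code = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:    # 1110xxxx: three-byte sequence (0xE9 lands here)
            length, code = 3, lead & 0x0F
        else:                        # anything else: keep the byte value as-is
            length, code = 1, lead
        # Fold in the low 6 bits of each following byte, valid or not.
        for b in data[i + 1 : i + length]:
            code = (code << 6) | (b & 0x3F)
        out.append(chr(code))
        i += length
    return "".join(out)


# ISO-8859-1 bytes of "différentes": 64 69 66 66 E9 72 65 6E 74 65 73
raw = "diff\u00e9rentes".encode("latin-1")
decoded = lenient_utf8_decode(raw)

if __name__ == "__main__":
    print(len(raw), len(decoded))
    print(hex(ord(decoded[4])))
```

Under this model the three bytes E9 72 65 ("ére") collapse into the single code point U+9CA5, which lies in the CJK ideograph range. That is consistent with both symptoms reported in the thread: "différentes" shrinking to "diff?ntes", and Asian characters appearing in the output.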
                 
                -----Original Message-----
                From: Anders Svensson [mailto:asn@...]
                Sent: Tuesday, May 21, 2002 14:54
                To: XSL-FO@yahoogroups.com
                Subject: [XSL-FO] French characters in XML

                I have a problem with foreign characters in XML documents. I'm guessing it's one of these Unicode-related problems that seem to be recurring in one capricious form or another every once in a while... I'm transforming the document with MSXML, but when I do, it comes out ruined, with all kinds of Asian characters and other signs, like so (I hope this renders correctly in the e-mail):

                Vous pouvez d?nir des colonnes diff?ntes ?tiliser dans les diverses ex?tions de reconnaissance Cependant, en s?ctionnant une colonne diff?nte, d'autres variables peuvent aussi ?e chang? entre les ex?tions. La table d?it comment reconna?e les colonnes.

                Does anyone know why and what to do about it? One thing to note is that I did the same transformation in XML spy (which uses MSXML as parser) and the text came out ok... Unfortunately doing the transformations within xml spy is not an option since there are a lot of files to transform. Also, it's not a particularly satisfactory solution in the long run...:-)

                Any help appreciated!

                /Anders


                To unsubscribe from this group, send an email to:
                XSL-FO-unsubscribe@egroups.com



                Your use of Yahoo! Groups is subject to the Yahoo! Terms of Service.


              • zile_yu
                Message 7 of 12 , Jun 5, 2002
                • 0 Attachment
                  <?xml version="1.0" encoding="utf-8"?>
                  <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                  version="1.0">
                  <xsl:strip-space elements="*"/>

                  <xsl:output
                  method="html"
                  indent="yes"
                  encoding="utf-8"/>

                  In the XSL stylesheet you can choose the output encoding, as shown above.
                  Did you use it? If not, give it a try.

                  Best regards,
                  Alex
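
Whatever output encoding is chosen, the bytes actually written have to match the encoding the reader assumes. The following Python sketch illustrates that general point only; it does not reproduce MSXML or the VB code in this thread, and `write_then_read` is a hypothetical helper that models "write a file in one encoding, open it in another" entirely in memory.

```python
def write_then_read(text: str, write_encoding: str, read_encoding: str) -> str:
    """Encode text as a writer would, then decode it as a reader would.

    Undecodable bytes are replaced with U+FFFD instead of raising an error.
    """
    stored = text.encode(write_encoding)
    return stored.decode(read_encoding, errors="replace")


word = "diff\u00e9rentes"  # "différentes"

# Writer and reader agree: the text survives.
matching = write_then_read(word, "utf-8", "utf-8")

# Written as UTF-8 but read as ISO-8859-1: é (C3 A9) shows up as two characters.
utf8_read_as_latin1 = write_then_read(word, "utf-8", "latin-1")

# Written as ISO-8859-1 but read as UTF-8: the lone byte E9 is invalid.
latin1_read_as_utf8 = write_then_read(word, "latin-1", "utf-8")

if __name__ == "__main__":
    print(matching)
    print(utf8_read_as_latin1)
    print(latin1_read_as_utf8)
```

So when checking the result of a transformation, it is worth comparing the encoding named in the output (the XML declaration or the HTML charset) against the encoding the file was really saved in.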



                  --- In XSL-FO@y..., "Anders Svensson" <asn@e...> wrote:
                  > I have a problem with foreign characters in xml documents. I'm
                  guessing it's one of these unicode related problems that seem to be
                  recurring in one capricious form or another every once in a while...
                  I'm transforming the document with MSXML, but when I do it comes out
                  ruined with all kinds of asian characters and other signs like so (I
                  hope this renders correctly in the e-mail):
                  >
                  > Vous pouvez d?nir des colonnes diff?ntes ?tiliser dans les
                  diverses ex?tions de reconnaissance Cependant, en s?ctionnant une
                  colonne diff?nte, d'autres variables peuvent aussi ?e chang? entre
                  les ex?tions. La table d?it comment reconna?e les colonnes.
                  >
                  > Does anyone know why and what to do about it? One thing to note is
                  that I did the same transformation in XML spy (which uses MSXML as
                  parser) and the text came out ok... Unfortunately doing the
                  transformations within xml spy is not an option since there are a
                  lot of files to transform. Also, it's not a particularly
                  satisfactory solution in the long run...:-)
                  >
                  > Any help appreciated!
                  >
                  > /Anders