French characters in XML

  • Anders Svensson
    Message 1 of 12 , May 21 5:54 AM
      I have a problem with foreign characters in XML documents. I'm guessing it's one of those Unicode-related problems that seem to recur in one capricious form or another every once in a while... I'm transforming the document with MSXML, but the output comes out ruined, with all kinds of Asian characters and other signs, like so (I hope this renders correctly in the e-mail):

      Vous pouvez d?nir des colonnes diff?ntes ?tiliser dans les diverses ex?tions de reconnaissance Cependant, en s?ctionnant une colonne diff?nte, d'autres variables peuvent aussi ?e chang? entre les ex?tions. La table d?it comment reconna?e les colonnes.

      Does anyone know why this happens and what to do about it? One thing to note is that I did the same transformation in XML Spy (which uses MSXML as its parser) and the text came out OK... Unfortunately, doing the transformations within XML Spy is not an option, since there are a lot of files to transform. Also, it's not a particularly satisfactory solution in the long run...:-)

      Any help appreciated!

      /Anders
    • Vincent De Groote
      Message 2 of 12 , May 21 6:18 AM
        It seems that, in the word "différentes" for example, three characters disappear.  The é character is usually coded as E9 in hexadecimal (in ISO-8859-1), and E9 is 1110 1001 in binary.  In UTF-8, a byte whose high nibble is 1110 starts a three-byte sequence, so your parser interprets that byte and the two that follow it as a single character.
         
        Are you sure the encoding information in your XML file is correct?  Isn't it an ISO Latin-encoded file?
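
To make the byte-level argument above concrete, here is a minimal sketch in Python (used only for illustration; the thread itself concerns MSXML). It assumes the file's bytes are ISO-8859-1 while the reader expects UTF-8. The variable names are hypothetical.

```python
# Encode a French word as ISO-8859-1 (Latin-1): each accented letter is one byte.
word = "différentes"
latin1_bytes = word.encode("iso-8859-1")

# 'é' becomes the single byte 0xE9, whose bit pattern starts with 1110.
e_acute = "é".encode("iso-8859-1")
print(e_acute.hex(), format(e_acute[0], "08b"))  # e9 11101001

# In UTF-8, a lead byte of the form 1110xxxx announces a three-byte sequence,
# so the decoder expects two continuation bytes (10xxxxxx) after 0xE9.
# Here the next bytes are 'r' and 'e', which are not continuation bytes,
# so a strict UTF-8 decoder rejects the input.
try:
    latin1_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Decoding with the encoding the bytes were actually written in works.
round_trip = latin1_bytes.decode("iso-8859-1")
```

Python's decoder is strict and raises an error here. A less strict decoder could instead consume all three bytes as one character, which would be consistent with "diff?ntes" in the sample output.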
         
        To unsubscribe from this group, send an email to:
        XSL-FO-unsubscribe@egroups.com



        Your use of Yahoo! Groups is subject to the Yahoo! Terms of Service.
      • Anders Svensson
        Message 3 of 12 , May 21 6:50 AM
          I don't think I understand encoding very well, so I'm not sure what kind of encoding it "really" is... All I can tell is what the processing instruction says at the top of the file, and the fact that when I look at the xml file it looks like an ok text file and that the xml tagging is correct. But if some sort of hidden encoding lurks beneath the surface somehow, then I have no idea how that works and what to do about it...
           
          /Anders
           
        • Vincent De Groote
          Message 4 of 12 , May 21 7:21 AM
            Look at the first line of your file.  If I understand your problem correctly, the first line should be:
             
            <?xml version="1.0" encoding="ISO-8859-1"?>
             
            You can also send me a sample of your file (to my private email) if you want.
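
One way to check whether the encoding named on that first line agrees with the file's actual bytes is sketched below in Python. This is only an illustration under stated assumptions: the helper names `declared_encoding` and `decodes_as_declared` are hypothetical, the regular expression is a simplification of the XML declaration grammar, and the sketch only handles ASCII-compatible encodings (not UTF-16).

```python
import re

def declared_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, or the default."""
    match = re.match(rb'<\?xml[^>]*?encoding=["\']([A-Za-z0-9._-]+)["\']', data)
    return match.group(1).decode("ascii") if match else default

def decodes_as_declared(data: bytes) -> bool:
    """True if the bytes are valid in the encoding the declaration claims."""
    try:
        data.decode(declared_encoding(data))
        return True
    except (UnicodeDecodeError, LookupError):
        return False

body = "<p>colonnes différentes</p>"
# Bytes written as Latin-1 and declared as Latin-1: consistent.
good = ('<?xml version="1.0" encoding="ISO-8859-1"?>' + body).encode("iso-8859-1")
# Bytes written as Latin-1 but declared as UTF-8: inconsistent.
bad = ('<?xml version="1.0" encoding="UTF-8"?>' + body).encode("iso-8859-1")

print(decodes_as_declared(good), decodes_as_declared(bad))  # True False
```

The check is one-sided: every byte sequence is valid ISO-8859-1, so it can catch Latin-1 bytes mislabelled as UTF-8 but not the reverse.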

          • Anders Svensson
            Message 5 of 12 , May 21 7:45 AM
              Hi, Vincent!
               
              I'm enclosing a sample file. I tried changing the encoding specification at the top of the file, but it didn't help...
               
              /Anders
               
            • paul@precisiondocuments.com
              Message 6 of 12 , May 21 10:01 AM
                On Tue, May 21, 2002 at 02:54:09PM +0200, Anders Svensson wrote:

                > I have a problem with foreign characters in xml documents.

                I recently had an opportunity to improve my knowledge of character
                encoding, and found the following article very helpful.

                It might be more than you want, and it might not even answer your
                specific question, but it has some good information.

                http://tronweb.super-nova.co.jp/characcodehist.html

                Later,
                --Paul
              • Vincent De Groote
                Message 7 of 12 , May 22 12:21 AM
                  Your input file seems to be correct (that is, it has the correct encoding information) and is displayed correctly in IE (which uses MSXML 3.0).  So, where do you see the problem?
                • Anders Svensson
                  Message 8 of 12 , May 22 12:36 AM
                    The problem is that it displays correctly when I look at it directly in IE, and when I transform it inside XML Spy (also using MSXML), but not when I transform files using MSXML directly as a standalone parser/transformer (see the example in my first message, which actually displays Asian characters where there are question marks in this e-mail). Like I said, I'm completely baffled by the phenomenon myself. I have no idea why the result would be different when I transform the document directly in MSXML compared to doing it inside XML Spy, but it is. The reason I have to do the transformations directly in MSXML is that I need to run multiple transformations on hundreds of files, so doing it inside XML Spy is not a viable option...
                     
                    /Anders
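
The Asian characters described above are what one would expect if ISO-8859-1 bytes were read by a UTF-8 decoder that does not validate continuation bytes, as suggested earlier in the thread. The Python sketch below is a deliberately naive decoder written only to illustrate that effect. It is not how MSXML is implemented, the thread does not confirm this as the cause, and the function name `lax_utf8_decode` is hypothetical.

```python
def lax_utf8_decode(data: bytes) -> str:
    """Decode UTF-8 without checking continuation bytes (deliberately naive)."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # 0xxxxxxx: plain ASCII, one byte.
            out.append(chr(b))
            i += 1
        elif b >> 5 == 0b110 and i + 1 < len(data):
            # 110xxxxx: two-byte sequence.
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        elif b >> 4 == 0b1110 and i + 2 < len(data):
            # 1110xxxx: three-byte sequence; the next two bytes are consumed
            # whether or not they are real continuation bytes.
            cp = ((b & 0x0F) << 12) | ((data[i + 1] & 0x3F) << 6) | (data[i + 2] & 0x3F)
            out.append(chr(cp))
            i += 3
        else:
            # Anything else is replaced by a placeholder in this sketch.
            out.append("?")
            i += 1
    return "".join(out)

garbled = lax_utf8_decode("différentes".encode("iso-8859-1"))
print(ascii(garbled))  # 'diff\u9ca5ntes'
```

The bytes for "ére" collapse into the single code point U+9CA5, which lies in the CJK Unified Ideographs block, so the word loses three letters and gains one Asian character, matching "diff?ntes" in the sample.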
                     
                  • Vincent De Groote
                    Message 9 of 12 , May 22 12:42 AM
                      Please forward me
                       
                      - the transformation file
                      - the exact command you use to launch the transformation
                       
                      or
                       
                      - the result of the transformation using MSXML
                       
                      vincent
                      -----Original Message-----
                      From: Anders Svensson [mailto:asn@...]
                      Sent: Wednesday, May 22, 2002 09:36
                      To: XSL-FO@yahoogroups.com
                      Subject: RE: [XSL-FO] French characters in XML

                      The problem is that it displays correctly when looking at it directly in IE, and when transforming it inside XML Spy (also using MSXML), but not when I transform files using MSXML directly as a standalone parser/transformer (see the example below - which actually displays as asian characters where there are question marks in this e-mail). Like I said, I'm completely baffled by the phenomenon myself. I have no idea why it would be different when I transform the document directly in MSXML compared to when doing it inside XML Spy, but it is. The reason I have to do transformations directly in MSXML is I need to do multiple transformations of hundreds of files, so doing it inside XML Spy is not a viable option...
                       
                      /Anders
                       
                      -----Original Message-----
                      From: Vincent De Groote [mailto:vincent.degroote@...]
                      Sent: den 22 maj 2002 09:22
                      To: XSL-FO@yahoogroups.com
                      Subject: RE: [XSL-FO] French characters in XML

                      Your input file seems to be correct (= with correct encoding informations) and is correctly displayed in ie (which uses msxml 3.0).  So, where do you see the problem ?
                      -----Original Message-----
                      From: Anders Svensson [mailto:asn@...]
                      Sent: Tuesday, May 21, 2002 16:46
                      To: XSL-FO@yahoogroups.com
                      Subject: RE: [XSL-FO] French characters in XML

                      Hi, Vincent!
                       
                      I'm enclosing a sample file. I tried changing the encoding specification at the top of the file, but it didn't help...
                       
                      /Anders
                       
                      -----Original Message-----
                      From: Vincent De Groote [mailto:vincent.degroote@...]
                      Sent: den 21 maj 2002 16:21
                      To: XSL-FO@yahoogroups.com
                      Subject: RE: [XSL-FO] French characters in XML

                      Look at the first line of your file.  If I understand your problem correctly, the first line should be
                       
                      <?xml version="1.0" encoding="ISO-8859-1"?>.
                       
                      You can also send me a sample of your file (to my private email) if you want.

                      -----Original Message-----
                      From: Anders Svensson [mailto:asn@...]
                      Sent: Tuesday, May 21, 2002 15:50
                      To: XSL-FO@yahoogroups.com
                      Subject: RE: [XSL-FO] French characters in XML

                      I don't think I understand encoding very well, so I'm not sure what kind of encoding it "really" is... All I can tell is what the processing instruction says at the top of the file, and the fact that when I look at the xml file it looks like an ok text file and that the xml tagging is correct. But if some sort of hidden encoding lurks beneath the surface somehow, then I have no idea how that works and what to do about it...
                       
                      /Anders
                       


                      To unsubscribe from this group, send an email to:
                      XSL-FO-unsubscribe@egroups.com



                      Your use of Yahoo! Groups is subject to the Yahoo! Terms of Service.


                    • Anders Svensson
                      Message 10 of 12 , May 22 1:04 AM
                        Unfortunately I'm not allowed to send transformation files due to company restrictions, but I can send the transformed HTML file. I hope that can give some clue as to what is going on...
                         
                        I can also send the VB code for executing the MSXML command:
                         
                        Sub Transform(pathVar, allFiles, resultName, xmlFile() As String)

                            Dim Source As New MSXML2.DOMDocument
                            Dim stylesheet As New MSXML2.DOMDocument

                            newName = False
                            If resultName <> "" Then
                                newName = True
                            End If

                            For i = 1 To allFiles
                                ' Load data.
                                Source.async = False
                                Source.Load pathVar & xmlFile(i)

                                ' Load style sheet.
                                stylesheet.async = False
                                stylesheet.Load xslPath

                                ' Do the transform
                                transformResult = Source.transformNode(stylesheet)

                                'MsgBox transformResult

                                'Print the results to a file
                                If newName = False Then
                                    lenXmlFile = Len(xmlFile(i))
                                    cutFileName = Left(xmlFile(i), lenXmlFile - 4)
                                    resultName = cutFileName & "." & fileFormat
                                Else
                                    resultName = resultName & "." & fileFormat
                                End If

                                Open pathVar & resultName For Output As #i
                                Print #i, transformResult
                                Close #i
                                'End of printing

                                Source.save (pathVar & xmlFile(i))
                            Next i

                        End Sub
                         
                        /Anders
                        -----Original Message-----
                        From: Vincent De Groote [mailto:vincent.degroote@...]
                        Sent: 22 May 2002 09:42
                        To: XSL-FO@yahoogroups.com
                        Subject: RE: [XSL-FO] French characters in XML

                        Please forward me
                         
                        - the transformation file
                        - the exact command you use to launch the transformation
                         
                        or
                         
                        - the result of the transformation using msxml
                         
                        vincent
                        -----Original Message-----
                        From: Anders Svensson [mailto:asn@...]
                        Sent: Wednesday, May 22, 2002 09:36
                        To: XSL-FO@yahoogroups.com
                        Subject: RE: [XSL-FO] French characters in XML

                        The problem is that it displays correctly when looking at it directly in IE, and when transforming it inside XML Spy (also using MSXML), but not when I transform files using MSXML directly as a standalone parser/transformer (see the example below, which actually displays as Asian characters where there are question marks in this e-mail). Like I said, I'm completely baffled by the phenomenon myself. I have no idea why it would be different when I transform the document directly in MSXML compared to when doing it inside XML Spy, but it is. The reason I have to do transformations directly in MSXML is that I need to do multiple transformations of hundreds of files, so doing it inside XML Spy is not a viable option...
                         
                        /Anders
                         
                      • Vincent De Groote
                        Message 11 of 12 , May 22 1:28 AM
                          The encoding of the HTML file is invalid: look at the third line, which says encoding=UTF-16.

                          Your file is not UTF-16 encoded (in UTF-16, most characters are coded on 2 bytes).

                          As proof: display the file in IE (OK, it looks strange).
                          Then select View/Encoding/Western European (Windows)
                          and it looks good.

                          Please check the encoding directives (if any) in your transformation file.  If you find a way to specify the encoding, it should look like 'iso-8859-1'.

                          Another potential option is to give the same directive to the parser (maybe as an option on the save operation). I have never used this parser, so I can't help you with that.

                          Or drop the wrong encoding specification after the file has been generated.  The file seems to display correctly without an encoding specification.
                           
                          Vincent
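The mismatch described above can be reproduced outside MSXML. The following is only a sketch in Python, not part of the original exchange: the sample phrase, the variable names, and the choice of little-endian UTF-16 are assumptions made for illustration. It shows why text stored with one byte per character turns into mostly East Asian characters when a reader trusts a UTF-16 label, and why switching the view to Western European (Windows) restores it.

```python
# Illustration only: sample text and names are assumptions, not taken from the real file.
text = "différentes colonnes"

# One byte per character, as in a Western European (Windows-1252) file.
raw = text.encode("cp1252")

# A reader that trusts "encoding=UTF-16" pairs the bytes up (little-endian assumed here),
# so every two Latin letters collapse into one unrelated 16-bit character.
misread = raw.decode("utf-16-le", errors="replace")

# Reading the same bytes as Windows-1252 gives the original text back.
restored = raw.decode("cp1252")

print(len(text), "characters stored in", len(raw), "bytes")
print("read as UTF-16:      ", misread)
print("read as Windows-1252:", restored)
```

The bytes on disk are never damaged in this scenario; only the label attached to them is wrong, which is consistent with the file looking fine once the encoding is overridden in the browser.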
                           
                           
                           
                           
                           
                           
                           
                           
                        • zile_yu
                          Message 12 of 12 , Jun 5, 2002
                            <?xml version="1.0" encoding="utf-8"?>
                            <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                            version="1.0">
                            <xsl:strip-space elements="*"/>

                            <xsl:output
                            method="html"
                            indent="yes"
                            encoding="utf-8"/>

                            In the XSL stylesheet you can choose the output encoding, as in the xsl:output element above.
                            Did you set it? If not, give it a try.

                            Best regards,
                            Alex
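Since MSXML's transformNode returns the result as a string, what lands on disk also depends on how the calling code writes that string out, so it can be worth checking the generated files themselves. Below is a rough Python sketch for doing that over many files; the function names `declared_encoding` and `bytes_match_declaration` are hypothetical helpers invented here, and the heuristics are assumptions rather than a complete encoding detector.

```python
import re

def declared_encoding(raw: bytes):
    """Return the encoding named in an XML declaration or HTML meta tag, if one is readable."""
    # Only works when the start of the file can be scanned as ASCII-compatible text.
    head = raw[:1024].decode("ascii", errors="ignore")
    match = re.search(r'(?:encoding|charset)\s*=\s*["\']?([A-Za-z0-9._-]+)', head, re.IGNORECASE)
    return match.group(1).lower() if match else None

def bytes_match_declaration(raw: bytes) -> bool:
    """Heuristic check that the bytes are plausible for the encoding the file declares."""
    enc = declared_encoding(raw)
    if enc is None:
        return True  # nothing readable was declared, so nothing is contradicted
    if enc.startswith("utf-16") or enc == "ucs-2":
        # A genuinely UTF-16 file would not have an ASCII-scannable declaration,
        # so finding this label in single-byte text is itself a contradiction.
        return False
    try:
        raw.decode(enc)
        return True
    except (UnicodeDecodeError, LookupError):
        return False

# Usage: Latin-1 bytes with a matching label pass; the same bytes labelled UTF-16 do not.
good = '<?xml version="1.0" encoding="ISO-8859-1"?><p>différentes</p>'.encode("latin-1")
bad = '<META http-equiv="Content-Type" content="text/html; charset=UTF-16"><p>différentes</p>'.encode("cp1252")
print(bytes_match_declaration(good), bytes_match_declaration(bad))
```

A check like this only flags contradictions between label and bytes; it cannot tell which of several single-byte encodings was really intended.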


