JSON.pm perl module, utf-8 and the UTF8 flag

  • Martin J. Evans
    Message 1 of 3 , Sep 28, 2006
      Hi,

      I can see that a $JSON::UTF8 flag has been added to 1.07 of JSON.pm which is
      described as:

      You can set a true value into $JSON::UTF8 for JSON::Parser and
      JSON::Converter to set UTF8 flag into strings contain utf8.

      Does anyone know exactly what this does?

      I was using JSON.pm 1.00 quite happily with unicode like this:

      use strict;
      use JSON;
      use Encode;
      use Data::Dumper;

      my $str = "\x{20ac}fred";
      print Dumper([$str]);
      print "str:utf8::is_utf8 = ", utf8::is_utf8($str) ? 1 : 0, "\n";
      print "str dump: ", join(" ", unpack("H*", $str)), "\n";

      my $obj = [ $str ];
      my $j = new JSON;
      my $js = $j->objToJson($obj);
      print "js:utf8::is_utf8 = ", utf8::is_utf8($js) ? 1 : 0, "\n";
      print "js dump: ", join(" ", unpack("H*", $js)), "\n";
      open OUT, ">json.out";
      binmode(OUT, ":utf8");
      print OUT "$js\n";
      close OUT;

      open IN, "<json.out";
      binmode (IN, ":utf8");
      my $fd="";
      $fd .= $_ while (<IN>);
      # Reuse $j and $obj; a second "my" would mask the earlier declarations.
      $j = new JSON;
      $obj = $j->jsonToObj($fd);
      print Dumper($obj);
      print "obj:utf8::is_utf8 = ", utf8::is_utf8($obj->[0]) ? 1 : 0, "\n";
      print "obj dump: ", join(" ", unpack("H*", $obj->[0])), "\n";

      which quite happily maintains utf-8 characters as shown by the output:

      $VAR1 = [
      "\x{20ac}fred"
      ];
      str:utf8::is_utf8 = 1
      str dump: e282ac66726564
      js:utf8::is_utf8 = 1
      js dump: 5b22e282ac66726564225d
      $VAR1 = [
      "\x{20ac}fred"
      ];
      obj:utf8::is_utf8 = 1
      obj dump: e282ac66726564

      Under what circumstances would I want to set $JSON::UTF8?

      Thanks

      Martin
      --
      Martin J. Evans
      Easysoft Ltd, UK
      http://www.easysoft.com
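
For background on what utf8::is_utf8 is reporting in the script above, here is a minimal sketch using only core Perl and the Encode module. It does not use JSON.pm and makes no claim about what $JSON::UTF8 does internally; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Character string: the code point U+20AC followed by "fred".
my $chars = "\x{20ac}fred";

# Byte string: the UTF-8 encoding of the same text (no UTF8 flag).
my $bytes = encode('UTF-8', $chars);

# Decoding the bytes gives back the original character string.
my $round = decode('UTF-8', $bytes);

printf "chars: length=%d is_utf8=%d\n",
    length($chars), utf8::is_utf8($chars) ? 1 : 0;   # length=5 is_utf8=1
printf "bytes: length=%d is_utf8=%d\n",
    length($bytes), utf8::is_utf8($bytes) ? 1 : 0;   # length=7 is_utf8=0
printf "round trip equal: %d\n", $round eq $chars ? 1 : 0;   # 1
```

The same text is 5 characters as a character string and 7 octets once encoded, which is the distinction the flag tracks.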
    • Philip Tellis
      Message 2 of 3 , Sep 28, 2006
        Sometime Today, MJE cobbled together some glyphs to say:

        > Hi,
        >
        > I can see that a $JSON::UTF8 flag has been added to 1.07 of JSON.pm which is
        > described as:
        >
        > You can set a true value into $JSON::UTF8 for JSON::Parser and
        > JSON::Converter to set UTF8 flag into strings contain utf8.

        I've had problems with JSON.pm and UTF8 on perl 5.6. I'd submitted a
        patch about this here:
        http://rt.cpan.org/Public/Bug/Display.html?id=19646

        OTOH, we've seen that JSON.pm has really bad performance. YAML::Syck
        appears to be far more performant, probably because it's mainly written
        in C.

        Philip

        --
        Order and simplification are the first steps toward mastery of a subject
        -- the actual enemy is the unknown.
        -- Thomas Mann
      • Martin J. Evans
        Message 3 of 3 , Sep 28, 2006
          On 28-Sep-2006 Philip Tellis wrote:
          > Sometime Today, MJE cobbled together some glyphs to say:
          >
          >> Hi,
          >>
          >> I can see that a $JSON::UTF8 flag has been added to 1.07 of JSON.pm which is
          >> described as:
          >>
          >> You can set a true value into $JSON::UTF8 for JSON::Parser and
          >> JSON::Converter to set UTF8 flag into strings contain utf8.
          >
          > I've had problems with JSON.pm and UTF8 on perl 5.6. I'd submitted a
          > patch about this here:
          > http://rt.cpan.org/Public/Bug/Display.html?id=19646
          >
          > OTOH, we've seen that JSON.pm has really bad performance. YAML::Syck
          > appears to be far more performant, probably because it's mainly written
          > in C.
          >
          > Philip

          Thanks for the reference Philip.

          I am actually using perl 5.8.8 so I had not seen any problem.

          I tried JSON::Syck and have seen it is faster, but I didn't like the fact that
          the default behavior does not work with utf-8 strings, i.e. if you change the
          script I previously posted to use JSON::Syck, then when the utf-8 encoded JSON
          data is read back from the file it is double utf-8 encoded. I realise that
          setting $JSON::Syck::ImplicitUnicode works around this.

          Martin
          --
          Martin J. Evans
          Easysoft Ltd, UK
          http://www.easysoft.com
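
To illustrate what "double utf-8 encoded" means in the message above, here is a minimal sketch using only the core Encode module. It does not call JSON::Syck and is not a claim about how that module produces the effect; it only shows the byte pattern that results when already-encoded octets are encoded a second time. The variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Character string: U+20AC followed by "fred".
my $text = "\x{20ac}fred";

# Correct: encode the characters once.
my $once = encode('UTF-8', $text);

# Mistake: the octets are treated as Latin-1 characters and encoded again,
# so each byte >= 0x80 expands to two bytes.
my $twice = encode('UTF-8', $once);

print "once:  ", unpack("H*", $once),  "\n";   # e282ac66726564
print "twice: ", unpack("H*", $twice), "\n";   # c3a2c282c2ac66726564
```

The first dump matches the "obj dump" line in the original post; the second is what the same string looks like after one extra round of encoding.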