Re: [eiffel-nice-library] We need a spec for CHARACTER.*lower/upper too

  • Roger Browne
    Message 1 of 4 , Mar 6, 2001
      franck@... wrote:

      > I think it's a good idea to define *_lower in terms of CHARACTER.lower, but
      > then we also need the spec for that, so let's do it at the same time...

      Let's see if we can come up with something straightforward in a short
      time.

      If not, I suggest we proceed with STRING for now. We can add a comment to
      the ELKS 2001 STRING specification that the postcondition of 'as_lower'
      in STRING assumes the availability of a suitable CHARACTER feature.

      > As far as I recall ELKS'95 has no case features at all in CHARACTER,
      > but implementations do...

      Here are the current versions.

      ISE:
         lower: CHARACTER
               -- Lowercase value of item
               -- Returns item if not is_upper
      HACT:
         lower: CHARACTER
               -- Lowercase value of `Current'
               -- Returns `Current' if not `is_upper'
      SE:
         to_lower: CHARACTER
               -- Conversion to the corresponding lower case.
      VE:
         to_lower: CHARACTER

      Two are called 'lower', and two are called 'to_lower', but the feature
      name I like is 'as_lower'.

      This would provide naming consistency:

      "is_*" would always denote a BOOLEAN-valued query
      "to_*" would always denote a command
      "as_*" would always denote a query equivalent to applying
      "to_*" to a clone of the current object
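
The convention can be illustrated with a small sketch. It is written in Python rather than Eiffel so that it runs as-is; the class `Text` and its attribute `value` are hypothetical names chosen for the illustration and are not part of ELKS or of any vendor library.

```python
import copy


class Text:
    """Hypothetical mutable string wrapper, used only to show the naming scheme."""

    def __init__(self, value: str) -> None:
        self.value = value

    def is_lower(self) -> bool:
        # "is_*": BOOLEAN-valued query with no side effects.
        return not any("A" <= ch <= "Z" for ch in self.value)

    def to_lower(self) -> None:
        # "to_*": command that changes the current object in place (ASCII A-Z only).
        self.value = "".join(
            chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch for ch in self.value
        )

    def as_lower(self) -> "Text":
        # "as_*": query equivalent to applying "to_*" to a clone of the current object.
        result = copy.copy(self)
        result.to_lower()
        return result
```

Calling `Text("ELKS 2001").as_lower()` yields a new object and leaves the receiver untouched, whereas `to_lower()` modifies the receiver and returns nothing.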

      > It's also a good occasion to define lower/upper to operate strictly
      > on a-z/A-Z. I think we should emphasize that the kernel features
      > must not go beyond ascii...

      We broadly agreed this earlier. It's not that it would be undesirable to
      provide for more than just ASCII. It's that it would take much longer,
      and the need seems to be best met by specialized classes/features rather
      than trying to make the core classes/features do every possible job.

      There are many approaches to specifying 'as_lower' in class CHARACTER.
      Here's one:

      06 MARCH 2001 VERSION (class CHARACTER):

      as_lower: CHARACTER
         converted_if_upper:
            ("ABCDEFGHIJKLMNOPQRSTUVWXYZ").index_of(old current, 1) > 0
            implies result = ("abcdefghijklmnopqrstuvwxyz").item(
               ("ABCDEFGHIJKLMNOPQRSTUVWXYZ").index_of(old current, 1))
         unchanged_if_not_upper:
            ("ABCDEFGHIJKLMNOPQRSTUVWXYZ").index_of(old current, 1) = 0
            implies result = old current

      This reminds me of the discussion we had during November 2000, when we
      removed 'left_adjust' and 'right_adjust' from ELKS 2001 STRING because
      they could easily be implemented outside the STRING class, and were hard
      to specify within the STRING class. We also figured that white-space
      testing should be done outside the CHARACTER class, because the
      definition of white-space may be different for different applications.
      Following that discussion, Christian Couder posted some sample mixin
      classes to this list in November 2000.
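
A minimal sketch of what externalized adjust features might look like, with the white-space definition supplied by the caller. This is a Python illustration under stated assumptions, not the mixin classes posted to the list: the function names mirror the removed STRING features, and the default set `ASCII_BLANKS` is an assumption.

```python
ASCII_BLANKS = " \t\r\n"  # assumed default; an application may pass its own set


def left_adjust(s: str, blanks: str = ASCII_BLANKS) -> str:
    """Return s without leading characters drawn from blanks."""
    return s.lstrip(blanks)


def right_adjust(s: str, blanks: str = ASCII_BLANKS) -> str:
    """Return s without trailing characters drawn from blanks."""
    return s.rstrip(blanks)
```

Because the set of blanks is a parameter, an application that treats, say, only the space character as white-space can call `left_adjust(s, " ")` without any change to the string class itself.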

      Along the same lines, it could be argued that all the upper/lower
      features from STRING and CHARACTER should be externalized - but I guess
      that the existing upper/lower features have been much more widely used
      than 'left/right_adjust'.

      Regards,
      Roger
      --
      Roger Browne - roger@... - Everything Eiffel
      19 Eden Park Lancaster LA1 4SJ UK - Phone +44 1524 32428
    • Arno Wagner
      Message 2 of 4 , Mar 6, 2001
        On Tue, Mar 06, 2001 at 01:58:14PM +0000, Roger Browne wrote:
        > franck@... wrote:
        >
        > > I think it's a good idea to define *_lower in terms of CHARACTER.lower, but
        > > then we also need the spec for that, so let's do it at the same time...
        >
        > Let's see if we can come up with something straightforward in a short
        > time.
        >
        > If not, I suggest we proceed with STRING for now. We can add a comment to
        > the ELKS 2001 STRING specification that the postcondition of 'as_lower'
        > in STRING assumes the availability of a suitable CHARACTER feature.
        >
        > > As far as I recall ELKS'95 has no case features at all in CHARACTER,
        > > but implementations do...
        >
        > Here are the current versions.
        >
        > ISE:
        > lower: CHARACTER
        > -- Lowercase value of item
        > -- Returns item if not is_upper
        > HACT:
        > lower: CHARACTER
        > -- Lowercase value of `Current'
        > -- Returns `Current' if not `is_upper'
        > SE:
        > to_lower: CHARACTER
        > -- Conversion to the corresponding lower case.
        > VE:
        > to_lower : CHARACTER
        >
        > Two are called 'lower', and two are called 'to_lower', but the feature
        > name I like is 'as_lower'.
        >
        > This would provide naming consistency:
        >
        > "is_*" would always denote a BOOLEAN-valued query
        > "to_*" would always denote a command
        > "as_*" would always denote a query equivalent to applying
        > "to_*" to a clone of the current object
        >

        I think we should stick to this at (almost) all cost. In this
        way the names document themselves.

        > > It's also a good occasion to define lower/upper to operate strictly
        > > on a-z/A-Z. I think we should emphasize that the kernel features
        > > must not go beyond ascii...
        >
        > We broadly agreed this earlier. It's not that it would be undesirable to
        > provide for more than just ASCII. It's that it would take much longer,
        > and the need seems to be best met by specialized classes/features rather
        > than trying to make the core classes/features do every possible job.

        Yes. After all these are _core_ classes. And the basic STRING/CHARACTER
        classes are not necessarily what is used in the user interface.
        What I would not like at all would be something like 'unicode
        everywhere'. It's not needed and complicates matters unnecessarily.
        (ASCII) string processing is an important functionality for
        system or network things, e.g. in Unix. Just think of processing
        the output of a command, or parsing protocol statements in e.g.
        SMTP.

        >
        > There are many approaches to specifying 'as_lower' in class CHARACTER.
        > Here's one:
        >
        > 06 MARCH 2001 VERSION (class CHARACTER):
        >
        > as_lower: CHARACTER
        > converted_if_upper:
        > ("ABCDEFGHIJKLMNOPQRSTUVWXYZ").index_of(old current, 1) > 0
        > implies result = ("abcdefghijklmnopqrstuvwxyz").item(
        > ("ABCDEFGHIJKLMNOPQRSTUVWXYZ").index_of(old current, 1))
        > unchanged_if_not_upper:
        > ("ABCDEFGHIJKLMNOPQRSTUVWXYZ").index_of(old current, 1) = 0
        > implies result = old current

        I like this formulation. I think that explicitly naming the list
        of characters to be converted with their conversion results is the
        most transparent and clear way to do it.
        [...]


        > Along the same lines, it could be argued that all the upper/lower
        > features from STRING and CHARACTER should be externalized - but I guess
        > that the existing upper/lower features have been much more widely used
        > than 'left/right_adjust'.

        Would be my assessment as well. IMO 'left/right_adjust' is mainly
        needed when dealing with "untrusted" input functions.
        A little like the Perl operation 'chomp', that removes a trailing
        CR/LF if it is there. But even in Perl things like 'left/right_adjust'
        are done manually (something like '$a =~ s/^\s*(.*?)\s*$/$1/;') and
        don't have a hardcoded implementation.

        Regards,
        Arno

        --
        Arno Wagner Dipl. Inform. ETH Zuerich wagner@...
        GnuPG: ID: F0C049F1 FP: 8C E0 6F A5 CC B1 5A 11 ED C7 AD D2 05 5E BB 6F
        "What I saw in the Xerox PARC technology was the caveman interface, you point
        and you grunt. A massive winding down, regressing away from language, in
        order to address the technological nervousness of the user. Users wanted to
        be infantilized, to return to a pre-linguistic condition in the using of
        computers, and the Xerox PARC technology's primary advantage was that it
        allowed users to address computers in a pre-linguistic way. This was to my
        mind a terribly socially retrograde thing to do, and I have not changed my
        mind about that." Eben Moglen (http://old.law.columbia.edu for more by E.M.)
      • Roger Browne
        Message 3 of 4 , Mar 6, 2001
          I wrote:
          > > 06 MARCH 2001 VERSION (class CHARACTER):
          > >
          > > as_lower: CHARACTER
          > > converted_if_upper:
          > > ("ABCDEFGHIJKLMNOPQRSTUVWXYZ").index_of(old current, 1) > 0
          > > implies result = ("abcdefghijklmnopqrstuvwxyz").item(
          > > ("ABCDEFGHIJKLMNOPQRSTUVWXYZ").index_of(old current, 1))
          > > unchanged_if_not_upper:
          > > ("ABCDEFGHIJKLMNOPQRSTUVWXYZ").index_of(old current, 1) = 0
          > > implies result = old current

          Arno Wagner wrote:
          > I like this formulation. I think that explicitly naming the list
          > of characters to be converted with their conversion results is the
          > most transparent and clear way to do it.

          This version can probably be improved by using 'has' rather than
          'index_of' on the left of "implies":

          06 MARCH 2001 19:30 VERSION (class CHARACTER):

          as_lower: CHARACTER
                -- Lower case equivalent.
             converted:
                ("ABCDEFGHIJKLMNOPQRSTUVWXYZ").has(old current)
                implies result = ("abcdefghijklmnopqrstuvwxyz").item(
                   ("ABCDEFGHIJKLMNOPQRSTUVWXYZ").index_of(old current, 1))
             unchanged:
                not ("ABCDEFGHIJKLMNOPQRSTUVWXYZ").has(old current)
                implies result = old current

          I also added a header comment and abbreviated the tags.
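
The two clauses can be checked against an executable model. The following is a Python sketch, not Eiffel and not any vendor's implementation: the function `as_lower` and the constants `UPPER` and `LOWER` are illustrative names, and Python's `find` is 0-based and returns -1 when the character is absent, where `index_of` in the spec is 1-based and yields 0.

```python
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER = "abcdefghijklmnopqrstuvwxyz"


def as_lower(c: str) -> str:
    """Lower-case equivalent of a single character, ASCII A-Z only."""
    assert len(c) == 1, "expects exactly one character"
    i = UPPER.find(c)  # 0-based position, or -1 when c is not an upper-case letter
    result = LOWER[i] if i >= 0 else c
    # "converted": has(c) implies result = lower.item(upper.index_of(c, 1))
    assert c not in UPPER or result == LOWER[UPPER.index(c)]
    # "unchanged": not has(c) implies result = c
    assert c in UPPER or result == c
    return result
```

With this model, `as_lower("Q")` gives `"q"`, while digits, punctuation and non-ASCII letters such as `"\u00c9"` come back unchanged, which matches the intent that the kernel feature does not go beyond ASCII.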

          Regards,
          Roger
          --
          Roger Browne - roger@... - Everything Eiffel
          19 Eden Park Lancaster LA1 4SJ UK - Phone +44 1524 32428