
Colon syntax: We barely knew thee...

  • n8spam@netscape.net
    Message 1 of 27, Mar 2, 2001
      --- In python-iter@y..., Guido van Rossum <guido@p...> wrote:
      > [Qrczak]
      > > But I think we may say good-bye to colon proposals and instead think
      > > about functions and methods producing appropriate sequences, including
      > > sequences of key,value pairs.
      >
      > Agreed.


      I seem to be perennially defending disfavored Python proposals, but
      here goes. I was quite relieved to see Guido's proposed colon syntax.
      It settled the discomfort I felt with the ambiguity of the statement:
      for something in dict:

      The downside? Let's see...

      1. It looks a little strange for those used to current Python syntax.
      This seems to be the most frequently voiced objection, but isn't
      very compelling, considering how often similar objections are
      raised against "print>>" and list comprehensions. The Py-dev
      team doesn't seem to have any qualms about making a few people
      squirm every now and then in the name of progress.

      2. There was a criticism that
          for thing in dict:
      maps to
          for key: in dict:
      while
          for thing in list:
      maps to
          for :item in list:
      This inconsistency is simple to clear up -- disallow
      "for thing in dict:" without a colon. It's ambiguous anyhow, and
      breaks backwards compatibility. Note that this is the behavior
      specified in the PEP.

      3. You can't use the same syntax for if statements. So what? For if
      statements you can use has_key() and friends. There's no
      performance to be gained by allowing "if thing in dict" -- it
      only introduces a source of ambiguity.

      4. Is there any criticism 4? Is there something subtle I've missed?


      On the plus side:

      1. The colon syntax is compact. Not a huge concern among Pythoneers,
      but nice when you can get it.

      2. The programmer can disregard keys or values without dummy
      variables. This can boost performance of some mapping objects by
      allowing them to skip the mapping step when only values are
      requested and thus don't have to be in key order. Nice, elegant
      icing on the cake.

      3. It's unambiguous. There's no confusion over whether the loop is
      iterating over keys, values, or both. I consider this the *major*
      advantage of the PEP. I just can't agree that "key" is the logical
      meaning of "thing" in "thing in dict". There's been plenty of
      anecdotal evidence from T. Wouters and others that I'm not the
      only one. I can understand that sometimes a default just has to
      be chosen, but with the colon syntax we have the opportunity to
      use a completely unambiguous construction. Why not take it?


      If we're giving up on key:value, what's the proposed alternative? Why
      is it better?


      Cheers,
      -Nathan

      (My e-mail address at caltech.edu is n8gray)
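
For readers who want something to run, the three loop forms argued over above can be approximated with ordinary dictionary methods. This is only a sketch: the colon forms in the comments are the proposal's syntax, which Python does not accept, and `inventory` is a made-up example dict.

```python
# Hypothetical sample data, for illustration only.
inventory = {"apples": 3, "pears": 5}

# Keys only: roughly what "for key: in dict" was meant to express.
keys = []
for key in inventory.keys():
    keys.append(key)

# Values only: roughly "for :value in dict".
values = []
for value in inventory.values():
    values.append(value)

# Keys and values together: roughly "for key:value in dict".
pairs = []
for key, value in inventory.items():
    pairs.append((key, value))
```

Each loop names the method it iterates over, so the reader never has to guess whether keys, values, or both are being produced.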
    • Clark C. Evans
      Message 2 of 27, Mar 2, 2001
        Hello. I am very new to python and find this discussion
        interesting as "iterators and generic functions" were the
        first thing that I marked as notably absent. I'd like to
        comment, but I don't really have much of a context so if
        I blunder badly, please forgive me.

        On Sat, 3 Mar 2001 n8spam@... wrote:
        > for thing in dict:
        > for key: in dict:
        > for :item in list:
        >
        > If we're giving up on key:value, what's the proposed
        > alternative? Why is it better?

        This syntax has potential. What if the colon could be
        used for filtering operations as well? For instance,
        in these examples, let _ be a wild card.

        for key:_ in dict: # loops through each key
        for _:value in dict: # loops through each value
        for key:value in dict: # loops through each key and value

        for key:'value' in dict: # loops through keys having value = 'value'

        I don't particularly like ":x" or "x:" since the former
        reminds me of SQL binding variables, and the latter
        reminds me of a subordinate clause. However, "_:x" and "x:_"
        make perfect sense to me.

        There is a likely problem: _ is a valid identifier, right?
        Well, could "*:value" and "key:*" be used instead then?

        Sorry if this is _way_ out in left field...

        Kind Regards,

        Clark
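
The filtering idea above (loop only over keys whose value equals a given constant) can be sketched without new syntax. The example below is an assumption about the intended meaning of `for key:'value' in dict`, and `colors` is made-up sample data.

```python
# Hypothetical sample data, for illustration only.
colors = {"sky": "blue", "sea": "blue", "grass": "green"}

# Intended reading of "for key:'blue' in colors": keep the keys
# whose associated value equals 'blue'.
blue_things = [key for key, value in colors.items() if value == "blue"]
```

The filter lives in an explicit `if` clause here, which keeps the loop target a plain name rather than a pattern.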
      • Clark C. Evans
        Message 3 of 27, Mar 2, 2001
          Another question: have people considered "tuple" syntax?
          I actually find this more readable than colon syntax.

          for (key,_) in dict:
          for (_,value) in dict:
          for (key,value) in dict:

          Once again, sorry if this is way off base. ;) Clark
        • qrczak@knm.org.pl
          Message 4 of 27, Mar 3, 2001
            Sat, 03 Mar 2001 03:29:22 -0000, n8spam@... <n8spam@...> wrote:

            > If we're giving up on key:value, what's the proposed alternative?

            for k in dict.keys():
            for k,v in dict.items():

            but with something more appropriate (using some lazy iteration protocol)
            instead of keys and items. Possibilities I see or have seen:

            1. xkeys(), xitems().
            2. keys, items.
            3. keys(), items().


            Why 1: Doesn't change existing methods or break any existing code.
            Doesn't require any magic to work - easy to understand.
            Consistent with xrange() and xreadlines().

            Why not 1: Introduces an unnecessary duplication of names only to
            enable good performance. A programmer should not have to
            worry whether he should use keys or xkeys, because their
            meaning is essentially the same.


            Why 2: A single attribute can be used as either iterator or list
            producer, with backward-compatible syntax for the latter.

            Why not 2: Attribute syntax plays the role of a method only because of
            backward compatibility. Classes must implement this using
            __getattr__ because it's really a stateful method call.
            Doesn't scale to places where an argument is needed or to
            module-scope functions (e.g. range).

            Why 3: A programmer says what he wants and Python takes care of the rest.
            A generic lazy list framework would make it possible to unify
            lists and iterators anywhere both are needed.

            Why not 3: Requires much internal magic to let such lazy list behave
            in an almost-compatible way with the old regular list - not
            sure if it can be done well at all. Breaks code which
            expected type(dict.keys()) == types.ListType.

            > Why is it better?

            Doesn't need separate "iterate over dictionary" and "iterate over
            sequence" concepts, but splits the conceptually different things:
            - deciding over which sequence to iterate,
            - iteration.

            --
            __("< Marcin Kowalczyk * qrczak@... http://qrczak.ids.net.pl/
            \__/
            ^^ SYGNATURA ZASTĘPCZA
            QRCZAK
          • gzeljko
            Message 5 of 27, Mar 3, 2001
              From: <qrczak@...>
              >
              > Why not 3: Requires much internal magic to let such lazy list behave
              > in an almost-compatible way with the old regular list - not
              > sure if it can be done well at all. Breaks code which
              > expected type(dict.keys()) == types.ListType.
              >

              Requires different evaluation of dict.keys() in different contexts.
              Maybe that was the motivation for the colon syntax, reserved for for-loops.

              ly-y'rs-gzeljko
            • qrczak@knm.org.pl
              Message 6 of 27, Mar 3, 2001
                Sat, 3 Mar 2001 11:23:31 +0100, gzeljko <gzeljko@...> wrote:

                > > Why not 3: Requires much internal magic to let such lazy list behave
                > > in an almost-compatible way with the old regular list - not
                > > sure if it can be done well at all. Breaks code which
                > > expected type(dict.keys()) == types.ListType.
                >
                > Requires different evaluation of dict.keys() in different contexts.

                It does not. It would always be a lazy list, which provides both
                a sequence protocol and an iteration protocol.

                The iteration protocol could just reuse the existing slice interface.
                A lazy list has fast seq[0] and seq[1:] calls which produce
                elements on demand.

                Unfortunately the slice interface is slow for normal lists, so it's not
                clear how a for loop could know which is better to use. In all cases
                both will work, but sometimes one is cheaper, and sometimes the other.

                The major problem is the fact that dicts are mutable. In order to
                preserve the return-by-value semantics of keys(), any modification of
                the dict would have to be caught, and would have to make every live
                object returned by keys() on that dict pull in the remaining keys.
                I don't see any big problems besides this.

                To put it more concretely, 'for x in seq: statement' would have the
                following semantics for sequences which are better iterated
                over using the slice protocol:

                _tmp = seq
                while 1:
                    try: x = _tmp[0]
                    except IndexError: break
                    statement
                    _tmp = _tmp[1:]

                Implementation of this protocol by various kinds of lazy lists
                (range(), readlines(), keys(), items()) is easy if we ignore the
                mutability problem.

                --
                __("< Marcin Kowalczyk * qrczak@... http://qrczak.ids.net.pl/
                \__/
                ^^ SYGNATURA ZASTĘPCZA
                QRCZAK
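
As an illustration of the slice-based stepping described in this message, here is a minimal sketch of an immutable lazy range whose `seq[0]` and `seq[1:]` are both cheap, driven by the loop expansion given above. The names `LazyRange` and `slice_protocol_for` are invented for this example and do not come from the thread.

```python
class LazyRange:
    """Immutable lazy sequence start, start+1, ..., stop-1 (hypothetical name)."""

    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __getitem__(self, index):
        if isinstance(index, slice):
            # This sketch supports only the "rest of the sequence" slice, seq[1:].
            if (index.start, index.stop, index.step) != (1, None, None):
                raise TypeError("only seq[1:] is supported in this sketch")
            # No elements are copied: the tail is just another small object.
            return LazyRange(min(self.start + 1, self.stop), self.stop)
        if 0 <= index < self.stop - self.start:
            return self.start + index
        raise IndexError(index)


def slice_protocol_for(seq, body):
    """Call body(x) for each element, stepping with seq[0] and seq[1:]."""
    _tmp = seq
    while True:
        try:
            x = _tmp[0]
        except IndexError:
            break
        body(x)
        _tmp = _tmp[1:]


seen = []
slice_protocol_for(LazyRange(0, 5), seen.append)

# An ordinary list works too, but each step copies the remaining elements.
also_seen = []
slice_protocol_for(["a", "b"], also_seen.append)
```

The list case shows the cost concern raised above: the protocol works for both types, but `_tmp[1:]` is constant-time for the lazy range and linear for a plain list.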
              • qrczak@knm.org.pl
                Message 7 of 27, Mar 3, 2001
                  3 Mar 2001 11:57:53 GMT, Marcin 'Qrczak' Kowalczyk <qrczak@...> wrote:

                  > Unfortunately the slice interface is slow for normal lists, so it's not
                  > clear how a for loop could know which is better to use. In all cases
                  > both will work, but sometimes one is cheaper, and sometimes the other.

                  Here is a backward compatible proposal.
                  The meaning of 'for x in seq: statement' is as follows:

                  try: _tmp = seq.__iterator__()
                  except AttributeError: _tmp = indexing_iterator(seq)
                  # Types with fast x[1:] can just def __iterator__(self): return self
                  # Types which don't define __iterator__ get the old iteration protocol.
                  while 1:
                      try: x = _tmp[0]
                      except IndexError: break
                      statement
                      _tmp = _tmp[1:]

                  indexing_iterator is a proxy which emulates the sliced iteration
                  interface in terms of the indexed iteration interface.

                  --
                  __("< Marcin Kowalczyk * qrczak@... http://qrczak.ids.net.pl/
                  \__/
                  ^^ SYGNATURA ZASTĘPCZA
                  QRCZAK
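
The message does not show `indexing_iterator` itself, so the following is a guess at one way such a proxy could work: it keeps an offset into the original sequence and answers `[0]` and `[1:]` without copying. `IndexingIterator`, `Countdown`, and `run_for_loop` are names invented for this sketch.

```python
def _is_rest_slice(index):
    """True only for the slice produced by seq[1:]."""
    return (isinstance(index, slice)
            and (index.start, index.stop, index.step) == (1, None, None))


class IndexingIterator:
    """Proxy exposing seq[0] / seq[1:] on top of plain indexing (hypothetical)."""

    def __init__(self, seq, offset=0):
        self.seq = seq
        self.offset = offset

    def __getitem__(self, index):
        if _is_rest_slice(index):
            # Advance by one position; the underlying sequence is shared.
            return IndexingIterator(self.seq, self.offset + 1)
        if isinstance(index, slice) or index < 0:
            raise TypeError("only [n] with n >= 0 and [1:] are supported")
        return self.seq[self.offset + index]  # IndexError past the end


class Countdown:
    """n, n-1, ..., 1 with cheap [0] and [1:], so it can be its own iterator."""

    def __init__(self, n):
        self.n = n

    def __iterator__(self):
        return self

    def __getitem__(self, index):
        if _is_rest_slice(index):
            return Countdown(max(self.n - 1, 0))
        if isinstance(index, slice):
            raise TypeError("only [1:] is supported in this sketch")
        if 0 <= index < self.n:
            return self.n - index
        raise IndexError(index)


def run_for_loop(seq, body):
    """The proposed expansion of 'for x in seq: body(x)'."""
    try:
        _tmp = seq.__iterator__()
    except AttributeError:
        _tmp = IndexingIterator(seq)  # old-style sequence: wrap it
    while True:
        try:
            x = _tmp[0]
        except IndexError:
            break
        body(x)
        _tmp = _tmp[1:]


from_list, from_countdown = [], []
run_for_loop([10, 20, 30], from_list.append)
run_for_loop(Countdown(3), from_countdown.append)
```

Types that lack `__iterator__` go through the proxy, which is what makes the scheme backward compatible in this sketch.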
                • Ka-Ping Yee
                  Message 8 of 27, Mar 3, 2001
                    On 3 Mar 2001 qrczak@... wrote:
                    > try: _tmp = seq.__iterator__()
                    > except AttributeError: _tmp = indexing_iterator(seq)
                    > # Types with fast x[1:] can just def __iterator__(self): return self
                    > # Types which don't define __iterator__ get the old iteration protocol.
                    > while 1:
                    >     try: x = _tmp[0]
                    >     except IndexError: break
                    >     statement
                    >     _tmp = _tmp[1:]
                    >
                    > indexing_iterator is a proxy which emulates sliced iteration interface
                    > in terms of indexed iteration interface.

                    I can see that this would work, but i don't understand why you prefer

                    while 1:
                        body(iter[0])
                        iter = iter[1:]

                    to

                    while 1: # when a new-style iterator is available
                        body(iter())

                    or

                    i = 0
                    while 1: # what happens when we use make_iterator()
                        body(iter[i])
                        i = i + 1

                    as the basic stepping operation (in the above, think of "body"
                    as the body of the for-loop and "iter" as the iterator object).


                    -- ?!ng

                    "The biggest cause of trouble in the world today is that the stupid people
                    are so sure about things and the intelligent folk are so full of doubts."
                    -- Bertrand Russell
                  • Ka-Ping Yee
                    Message 9 of 27, Mar 3, 2001
                      On Sat, 3 Mar 2001, Clark C. Evans wrote:
                      > Another question, have people considered "tuple" syntax?
                      > I actually find this more readable than colon syntax.
                      >
                      > for (key,_) in dict:
                      > for (_,value) in dict:
                      > for (key,value) in dict:

                      Binding to tuples already has a well-defined meaning.

                      blah = [(1, 2), (3, 4), (5, 6)]
                      for (a, b) in blah:

                      makes perfect sense, analogous to (a, b) = (1, 2) in Python.

                      Similarly --

                      blah = {1: 2, 3: 4, 5: 6}
                      for key:value in blah:

                      It wouldn't make sense (even aside from the compatibility issue!)
                      to talk about tuples in a context where there aren't any.

                      Hmm, this has been explained before. Perhaps i should add some
                      stuff to the Rationale section of the PEP recounting these common
                      suggestions/objections and their rebuttals.


                      -- ?!ng

                      "The biggest cause of trouble in the world today is that the stupid people
                      are so sure about things and the intelligent folk are so full of doubts."
                      -- Bertrand Russell
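
The existing meaning of tuple binding mentioned in this message can be checked directly. The snippet below is a small runnable version of the list example; the `sums` list is added only so there is something to inspect.

```python
blah = [(1, 2), (3, 4), (5, 6)]

sums = []
for (a, b) in blah:      # each element of blah is unpacked into a and b
    sums.append(a + b)

(c, d) = (1, 2)          # the same unpacking in a plain assignment
```

The loop target behaves exactly like the left-hand side of the assignment, which is why reusing tuple syntax for a different purpose would clash with it.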
                    • gzeljko
                      Message 10 of 27, Mar 3, 2001
                        From: <qrczak@...>
                        > Sat, 3 Mar 2001 11:23:31 +0100, gzeljko <gzeljko@...> wrote:
                        >
                        > > > Why not 3: Requires much internal magic to let such lazy list behave
                        > > > in an almost-compatible way with the old regular list - not
                        > > > sure if it can be done well at all. Breaks code which
                        > > > expected type(dict.keys()) == types.ListType.
                        > >
                        > > Requires different evaluation of dict.keys() in different contexts.
                        >
                        > It does not. It would always be a lazy list, which provides both
                        > a sequence protocol and an iteration protocol.

                        a = some_dict.keys() # a is going to be a 'list'

                        In other words, they 'must' implement the complete list interface,

                        1. without their own backing store
                        2. with their own backing store on demand

                        1. is a clear concept, but here it is impossible:
                        a[i] = something
                        # can't be implemented with the old meaning

                        2. is confusing and impractical (IMHO)

                        if-they-was-tuples-ly-y'rs-gzeljko
                      • qrczak@knm.org.pl
                        Message 11 of 27, Mar 3, 2001
                          Sat, 3 Mar 2001 04:42:30 -0800 (PST), Ka-Ping Yee <ping@...> wrote:

                          > I can see that this would work, but i don't understand why you prefer
                          >
                          > while 1:
                          >     body(iter[0])
                          >     iter = iter[1:]
                          >
                          > to
                          >
                          > while 1: # when a new-style iterator is available
                          >     body(iter())

                          The former uses an already existing interface. The latter is a new
                          interface.

                          The former doesn't mutate the iterator. Because of this a sequence
                          itself can be its own iterator. For types with fast s[0] and s[1:]
                          s.__iterator__() can just return self instead of producing a new
                          proxy object. The __iterator__() method is stateless - it could
                          even be an attribute instead of a method (but it would introduce
                          a reference cycle).

                          The former is a functional style; the latter is imperative. Lazy
                          lists are immutable by nature. It's much simpler to provide a full
                          immutable sequence protocol than a full mutable sequence protocol.
                          For iteration generic mutability is not needed anyway.


                          I realized that unfortunately the list returned by range() is currently
                          mutable, so it should probably remain mutable for compatibility. In
                          case we want to apply the lazy list framework to plain range() and
                          readlines() retaining their mutability, the picture becomes much more
                          complicated:-(

                          Under these assumptions only, we can equally well define the iteration
                          protocol thus, because iterators are now mutable:

                          try: _tmp = seq.__iterator__()
                          except AttributeError: _tmp = indexing_iterator(seq)
                          while 1:
                              try: x = _tmp[0]
                              except IndexError: break
                              statement
                              del _tmp[:1] # Here is the difference.

                          (The builtin list case can be optimized by avoiding the protocol
                          emulation and proceeding as currently.)

                          Now __iterator__ must always return a new stateful proxy each time
                          it is called. Here is an untested subset of the implementation of
                          mutable lazy range (only unary constructor, no negative indices,
                          many methods skipped):

                          class range:
                              def __init__(self, n):
                                  self.items = []
                                  self.start = 0  # was "self.from"; "from" is a reserved word
                                  self.end = n
                                  # The abstract meaning of this object is
                                  # self.items + [self.start, self.start+1, ..., self.end-1]

                              def __getitem__(self, i):
                                  try: return self.items[i]
                                  except IndexError:
                                      if i < len(self.items) + self.end - self.start:
                                          # i falls in the numeric part: offset from self.start.
                                          return self.start + i - len(self.items)
                                      raise IndexError

                              def __setitem__(self, i, x):
                                  try: self.items[i] = x
                                  except IndexError:
                                      if i >= len(self.items) + self.end - self.start:
                                          raise IndexError
                                      # Must materialize the beginning of the numeric part.
                                      while i > len(self.items):
                                          self.items.append(self.start)
                                          self.start += 1
                                      # Invariant about the abstract meaning restored.
                                      self.items.append(x)
                                      self.start += 1

                              def __delitem__(self, index):
                                  # Python 3 passes "del r[i:j]" here as a slice object;
                                  # __delslice__ was the Python 2 hook. Step 1 is assumed.
                                  i, j, _ = index.indices(len(self.items) + self.end - self.start)
                                  self.__delslice__(i, j)

                              def __delslice__(self, i, j):
                                  if i >= j: return
                                  if j > len(self.items) + self.end - self.start:
                                      j = len(self.items) + self.end - self.start
                                  if j <= len(self.items):
                                      # The range is completely inside the self.items part.
                                      del self.items[i:j]
                                  elif i <= len(self.items):
                                      # The range spans both self.items and the numeric part.
                                      self.start += j - len(self.items)
                                      del self.items[i:]
                                  else:
                                      # The range is completely inside the numeric part.
                                      # Must materialize the beginning of the numeric part.
                                      while i > len(self.items):
                                          self.items.append(self.start)
                                          self.start += 1
                                      # Invariant about the abstract meaning restored.
                                      self.start += j - i

                          The immutable range (i.e. xrange) is much simpler.

                          Generally functional style (i.e. immutable) is simpler :-)

                          --
                          __("< Marcin Kowalczyk * qrczak@... http://qrczak.ids.net.pl/
                          \__/
                          ^^ SYGNATURA ZASTĘPCZA
                          QRCZAK
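
To make the "new stateful proxy each time" requirement concrete, here is a minimal sketch of the mutating variant of the loop (`del _tmp[:1]`) applied to a proxy rather than to the sequence itself. `ConsumingProxy` and `mutating_for` are names invented for this example; the thread does not define them.

```python
class ConsumingProxy:
    """Stateful view of a sequence: deleting the front advances an offset
    and leaves the underlying sequence untouched (hypothetical helper)."""

    def __init__(self, seq):
        self.seq = seq
        self.offset = 0

    def __getitem__(self, index):
        if index < 0:
            raise IndexError(index)
        return self.seq[self.offset + index]  # IndexError once exhausted

    def __delitem__(self, index):
        # This sketch supports only "del proxy[:1]".
        is_front = (isinstance(index, slice)
                    and (index.start, index.stop, index.step) == (None, 1, None))
        if not is_front:
            raise TypeError("only del proxy[:1] is supported in this sketch")
        self.offset += 1


def mutating_for(seq, body):
    """The del-based expansion of 'for x in seq: body(x)'."""
    _tmp = ConsumingProxy(seq)  # a fresh proxy for every loop
    while True:
        try:
            x = _tmp[0]
        except IndexError:
            break
        body(x)
        del _tmp[:1]


data = [1, 2, 3]
first_pass, second_pass = [], []
mutating_for(data, first_pass.append)
mutating_for(data, second_pass.append)
```

Because each loop consumes its own proxy, the second pass still sees every element, which is the reason a shared or stateless iterator would not work under this variant.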
                        • gzeljko
                          Message 12 of 27, Mar 3, 2001
                            From: <qrczak@...>
                            >
                            > 1. xkeys(), xitems().
                            > 2. keys, items.
                            > 3. keys(), items().
                            >
                            >
                            > Why 1: Doesn't change existing methods or break any existing code.
                            > Doesn't require any magic to work - easy to understand.
                            > Consistent with xrange() and xreadlines().
                            >
                            > Why not 1: Introduces an unnecessary duplication of names only to
                            > enable good performance. A programmer should not have to
                            > worry whether he should use keys or xkeys, because their
                            > meaning is essentially the same.
                            >
                            >
                            > Why 2: A single attribute can be used as either iterator or list
                            > producer, with backward-compatible syntax for the latter.
                            >
                            > Why not 2: Attribute syntax plays the role of a method only because of
                            > backward compatibility. Classes must implement this using
                            > __getattr__ because it's really a stateful method call.
                            > Doesn't scale to places when an argument is needed or to
                            > module-scope functions (e.g. range).
                            >

                            You can think about that in your own terms :)

                            so:

                            dict.keys = stateless lazy list
                            dict.keys() = dict.keys.__call__() - to produce an old-fashioned list

                            __getattr__-don't-needed-ly-y'rs-gzeljko
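
One possible reading of this suggestion is an attribute object that can be iterated lazily and also called to get a real list. The sketch below assumes that reading; `LazyKeys` and `Table` are invented names, and a plain class stands in for the built-in dict type.

```python
class LazyKeys:
    """Iterable view of a mapping's keys; calling it builds a list."""

    def __init__(self, mapping):
        self.mapping = mapping

    def __iter__(self):
        # Lazy path: used by "for k in table.keys".
        return iter(self.mapping)

    def __call__(self):
        # Backward-compatible path: "table.keys()" returns a real list.
        return list(self.mapping)


class Table:
    """Stand-in for a mapping type (hypothetical, for illustration)."""

    def __init__(self, data):
        self._data = dict(data)
        # A plain attribute, so no __getattr__ hook is involved. It refers to
        # the underlying dict, not to the Table, so no reference cycle arises.
        self.keys = LazyKeys(self._data)


table = Table({"a": 1, "b": 2})
as_list = table.keys()                 # old-style call
lazily = [k for k in table.keys]       # attribute used as an iterable
```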