Re: [ISO8601] Re: Clarifications: 5.2.2.2

  • P A Hill & E V Goodall
    Message 1 of 15 , Jul 14, 2001
      g1smd@... wrote:
      > Section 4.6 refers. 'These leading hyphens may be omitted in
      > the applications where there is no risk of confusing these
      > representations with others'. Any leading hyphen would always
      > be to replace a missing element. Hyphens between elements are
      > separators in Extended Formats. There are no separators in the
      > Basic formats. There are never any separators before the first
      > element, only 'replacement' hyphens for missing elements.

      This is probably why the note which looks unobvious to me is
      obvious to you. I read 4.6 with the most important opening
      phrase being "By mutual agreement of the partners in information interchange".
      Thus, it provides a way, in specific applications of this standard, to drop
      something that is stated in the standard. I don't see 4.6 as suggesting that it
      is the rationale which was used to come up with the formats which are in the standard.

      > It's all in
      > paragraph 4.6 as far as I can see.

      It says you can drop what is there, but it doesn't say that the full representation
      would treat the hundreds part of the year as a separate component from
      the tens and ones of the year. Only a few examples hint at that, but not
      all of them do and not all exceptions are noted.

      > Does this still appear
      > in the published ISO 8601:2000?

      I wouldn't know, I don't have it, I just have the various free downloads.
      Hopefully this was clear. If not it should be by now.

      > > -YY appears in truncated calendar dates, i.e. -YY-MM-DD,
      > > see examples of section 5.2.1.3.
      >
      > You have misquoted the standard.

      Sorry, my mistake. It should have read "-YY-MM".

      > In all these formats: -YYMM and -YY-MM, and -YY, the hyphen
      > does replace the missing two digits of the 'century'. In
      > YY-MM-DD it has been left out, as per para 4.6.

      Again, that is not what a pedantic read of 4.6 says. 4.6 says that I and
      whoever I communicate with can cut what we see even further when we are
      only using some agreed-upon subset of everything; it doesn't
      provide a rationale for what is in the standard.

      > > 5.2.2.2 (a) YY-DDD <-- no dash for missing numeric century
      >
      > Yes, because in 05-005, the '005' has to be the Day of the Year,
      > there is absolutely no other possibility. Therefore, the element
      > before that has to be a two-digit year. The leading hyphen can
      > therefore be omitted, exactly as per the YY-MM-DD example, above.
      > That is, -YY-DDD is unnecessary, YY-DDD will suffice.

      It is too bad that 4.6 doesn't actually introduce the idea that
      the writers of the standard used the idea as you
      claim, to come up with their various formats.

      Maybe, some discussion at 5.2.1 ... Year would actually set at least
      me in the right mind set.

      Also, if your suggestion as to the design is the case, I would expect a note
      like the one I was surprised to see in 5.2.2.2 after 5.2.1.3 (a)
      and the note at 5.2.3.3 noting all of the variations which are
      not "fully hyphenated". This would make them all consistent.
      The note at 5.2.3.3 would not list only
      one format, but mention all of those which one might think might have
      a leading dash for missing 'century' and another for missing year pointing
      out the simplification.

      In fact, the standard before the first truncated format in the opening paragraph
      of 5.2.1.3 says "In each case hyphens that indicate components should be used only
      as indicated or shall be omitted."

      That to me hints that some of the choices are arbitrary, so don't play around with
      them. Also, there is no place that states that all formats are mutually
      unambiguous from each other. As I was reading I was looking for just such
      a statement or examples that violated the idea. I found neither, but that
      is no proof.

      Hopefully this can all be clarified in the next edition.

      > OVER TO YOU! There are over 100 people out there reading this.
      > What say all of you? Am I right? Dan Kohn? Fred Bone? Pete
      > Forman? Aron Roberts?

      The question is not whether you are right, it is a question of
      meaning of the standard. Or another way to put it: You may be right,
      and I have no reason to think you aren't, but the standard is still
      not clear on where the "should" comes from.

      I personally am now convinced by what you have provided that
      I was misled by the facts that the very first truncated example
      YYMMDD doesn't have a leading dash, doesn't have a note which
      points this out and there is nothing up to that point which says
      what the expected style is, and there is no consistency in the following
      examples, so when I finally get to 5.2.2.2 "note: ... should be ..."
      I go back and read looking for something that tells me what should
      be anywhere and all I see are various examples without explanation
      that any are exceptions to any expectations. Thus reading the
      standard does not make me think there is any "should" involved other
      than the examples as given. That is the source of my question
      about this note.

      > You actually said:
      >
      > >>>> Also, were to where or whom do I send simple typo corrections?

      Gee, this is a useful comment, who wasn't going to discuss each
      word and comma? :-(

      > > Anyone know the USA rep, or the list of IEEE committee members?
      >
      > The ANSI or NIST Web Sites should direct you to that information.

      Yes, of course it should (not that there is an ISO standard defining that
      it should! :-). I already tried that and didn't find it.

      > Try that one as well. What is there to lose? If they can't
      > help you then I would hope they would point you towards someone
      > who can. Maybe one of the 100-plus people receiving this email
      > has a better suggestion?

      Any suggestion in my opinion is probably better than just
      pointing out that there are some e-mail addresses. I was actually
      hoping for something useful not just the obvious. I have noted
      your comments with regard to Louis Visser.

      -Paul
    • Pete Forman
      Message 2 of 15 , Jul 16, 2001
        P A Hill & E V Goodall writes:
        > [snip]
        > In fact, the standard before the first truncated format in the
        > opening paragraph of 5.2.1.3 says "In each case hyphens that
        > indicate components should be used only as indicated or shall be
        > omitted."
        >
        > That to me hints that some of choices are arbitrary, so don't play
        > around with them. Also, there is no place that states that all
        > formats are mutually unambiguous from each other. As I was reading
        > I was looking for just such a statement or examples that violated
        > the idea. I found neither, but that is no proof.

        How about the last paragraph in 4.1?


        I agree that the choices seem arbitrary. There may be some logic
        behind it though. My guess is that the rules are something like

        Replace an omitted component in a truncated format with a hyphen.
        Component may be century (first two digits of a four digit year)
        or decade (first three digits of a four digit year) or last two
        digits of the year or month or week. The term component is not
        defined as such but the components are listed for each of the
        truncated representations.

        Remove hyphens if the result is unambiguous.
        4.6 para 1 states that a hyphen may be necessary to represent an
        omitted component. That implies to me that the hyphen should not
        be used if possible.

        (The above two rules may also be expressed as: Components may be
        omitted, if the result is ambiguous then use a hyphen to stand for
        the omitted component.)

        Add hyphens where a format is ambiguous. This is bound to be
        arbitrary: if two formats collide one must be chosen to get the
        extra hyphen.
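The first of the guessed rules above (replace each omitted leading component with a hyphen) can be sketched in a few lines of Python. This is only an illustration of the rule as worded in this message, not text from the standard; the function name `truncate_basic` and the split of a date into four two-digit components are assumptions made for the example.

```python
# Hypothetical helper illustrating the guessed rule "replace an omitted
# component in a truncated format with a hyphen" (Basic format, no separators).
def truncate_basic(components, n_omitted):
    """Drop the first n_omitted components, writing one hyphen for each."""
    if not 0 <= n_omitted < len(components):
        raise ValueError("must keep at least one component")
    return "-" * n_omitted + "".join(components[n_omitted:])


# Assumed split: first two digits of the year, last two digits, month, day.
date = ["19", "85", "04", "12"]

full = truncate_basic(date, 0)        # nothing omitted
no_century = truncate_basic(date, 1)  # one hyphen stands for the '19'
month_day = truncate_basic(date, 2)   # two hyphens stand for '19' and '85'
day_only = truncate_basic(date, 3)    # three hyphens, day only
```

The second rule (remove hyphens if the result is unambiguous) is deliberately left out, because it depends on which other formats are in use. The `no_century` result is the fully hyphenated form, whereas the standard lists the same date without the leading hyphen.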


        Note that omitting hyphens by these rules is a separate issue to
        5.2.1.3. I take the latter to mean that the communicating parties
        agree that, for example, two digits alone mean a month rather than
        using four characters of 5.2.1.3.e proper.
        --
        Pete Forman -./\.- Disclaimer: This post is originated
        WesternGeco -./\.- by myself and does not represent
        pete.forman@... -./\.- opinion of Schlumberger, Baker
        http://www.crosswinds.net/~petef -./\.- Hughes or their divisions.
      • Douglas_Luthanen@Cargill.com
        Message 3 of 15 , Jul 16, 2001
          How may I be deleted from this mailing list?
          Respectfully,

          dwl
          Doug Luthanen
          501 750-6840


          -----Original Message-----
          From: goodhill@... [mailto:goodhill@...]
          Sent: Thursday, July 12, 2001 1:50 PM
          To: iso8601@yahoogroups.com
          Subject: [ISO8601] Clarifications: 5.2.2.2


          To test this group, and my ability to post to it, I will ask a simple
          question.
          I was reading ISO/TC 154 N 362
          ISO8601:2000(E)2000-12-19

          Ordinal Date
          Section 5.2.2.2 Truncated Version

          <quote>
          Day only in the implied year
          Basic format: -DDD EXAMPLE -102
          Extended format: not applicable

          NOTE Logically, the representation should be [--DDD], but the first
          hyphen is superfluous and, therefore, it has been omitted.
          </quote>

          If I understand this correctly, they are referring to the "not
          applicable" Extended format which the standard has left out. The
          first - would be the missing (i.e. truncated) year; the second would
          be the separator dash between the year and the day.

          Do I understand the intent of the dashes and why "the representation
          should be [--DDD]"?

          Also, were to where or whom do I send simple typo corrections?

          -Paul




        • g1smd@amsat.org
          Message 4 of 15 , Jul 16, 2001
            On 2001-Jul-14 Paul Hill <goodhill@...> wrote:


            [2001-Jul-16]



            These comments are concerning the text of ISO/TC 154 N 362 [PDF]
            document, which is the final draft version of the ISO 8601:2000
            standard. The final published version of ISO 8601:2000 is still
            not available online, but this draft (from only four days
            previously) can still be downloaded from [PDF00005.PDF]:
            <http://lists.ebxml.org/archives/ebxml-core/200104/msg00252.html>.



            >> Section 4.6 refers. 'These leading hyphens may be omitted in
            >> the applications where there is no risk of confusing these
            >> representations with others'. Any leading hyphen would always
            >> be to replace a missing element. Hyphens between elements are
            >> separators in Extended Formats. There are no separators in the
            >> Basic formats. There are never any separators before the first
            >> element, only 'replacement' hyphens for missing elements.

            > This is probably why the note which looks unobvious to me is
            > obvious to you. I read 4.6 with the most important opening
            > phrase "By mutual agreement of the partners in information
            > interchange" Thus, it provides a way in specific applications
            > of this standard to drop something this is stated in the
            > standard. I don't see 4.6 as suggesting that it is the
            > rationale which was used to come up with the formats which
            > are in standard.

            I am very familiar with the various allowed formats, and
            variations, and being familiar with all of that, it is very
            easy for me to miss a point or hint in the wording, or to know
            something which although it isn't actually stated in the standard,
            is actually the way things are done. I do see that you have a
            point here. They state that the leading hyphens *may* be dropped,
            then go on to just automatically drop them, in some of the
            examples, without putting a note against some of them. That is
            not very good. Each and every time they do this, it does need an
            extra note or clarification. However, is para 4.9 perhaps a
            poorly worded way of trying to tell us about this?



            >> It's all in paragraph 4.6 as far as I can see.

            > It says you can drop what is there, but it doesn't say that
            > the full representation would treat the hundreds part of the
            > year as a separate component from the tens and ones of the
            > year. Only a few examples hint at that, but not all of them
            > do and not all exceptions are noted.

            I agree that this is a bit sloppy. It needs rewriting, or some
            additional notes. The minimum required, would be to state that
            the year may be specified by two, or by four, or by more
            digits; and I see a problem here... 121212 is assumed to be
            YYMMDD, but this could be the YYYYYY 121212. Having re-read the
            standard I see that para 4.7 does cover this. Additionally,
            para 4.8 does say that elements do all have a defined length,
            and that leading zeroes must be used to fulfil this. It
            doesn't fully answer your point, so I guess their wording
            needs improving.



            >> Does this still appear in the published ISO 8601:2000?

            > I wouldn't know, I don't have it, I just have the various
            > free downloads. Hopefully this was clear. If not it should
            > be by now.

            I am still waiting for someone on this email list to compile
            a note of all of the changes between the 2000-Dec-19 draft
            and the 2001-Jan-24 final published version of ISO 8601.
            Any volunteers?



            >>> -YY appears in truncated calendar dates, i.e. -YY-MM-DD,
            >>> see examples of section 5.2.1.3.

            >> You have misquoted the standard.

            > Sorry, my mistake. It should have read "-YY-MM".

            No problems. Typos happen, but it did hinder your argument a
            bit. Here, the first hyphen is replacing the 'old' 'CC' and
            the second hyphen is the separator. The Basic format of this
            is -YYMM, where the hyphen again replaces the 'old' 'CC'.
            In 'YY-MM-DD' ISO automatically dropped the leading hyphen,
            as a date like '12-12-12' cannot possibly be anything other
            than 'YY-MM-DD'. Is this hinted at in 4.9?



            >> In all these formats: -YYMM and -YY-MM, and -YY, the hyphen
            >> does replace the missing two digits of the 'century'. In
            >> YY-MM-DD it has been left out, as per para 4.6.

            > Again, that is not what a pedantic read of 4.6 says. 4.6 says
            > me and who ever I communicate with can cut what we see even
            > further when we are only using some agreed upon subset of
            > everything, it doesn't provide a rationale for what is in the
            > standard.

            As above, I do agree that the wording in 4.6 provides a reason
            for this, but does not provide a complete rationale of why this
            is done. I hope ISO clarifies it in the next edition. I do now
            see what your point is; that you *may* drop the hyphen, but
            for some reason, with YY-MM-DD and YYMMDD, ISO have *already*
            dropped it, without clearly saying why. However, is para 4.9
            perhaps a poorly worded way of trying to tell us about this?



            >>> 5.2.2.2 (a) YY-DDD <-- no dash for missing numeric century

            >> Yes, because in 05-005, the '005' has to be the Day of the
            >> Year, there is absolutely no other possibility. Therefore,
            >> the element before that has to be a two-digit year. The
            >> leading hyphen can therefore be omitted, exactly as per the
            >> YY-MM-DD example, above. That is, -YY-DDD is unnecessary,
            >> YY-DDD will suffice.

            > It is too bad that 4.6 doesn't actually introduce the idea that
            > the writers of the standard used the idea as you claim, to come
            > up with their various formats.

            Extra notes would be useful; but I also think there is a 'logic
            error' or 'precedence error' in providing some of the default
            formats. I'll explain more at the end of this message. I've
            hinted at it with the note about YYMMDD and YYYYYY above; but
            it also concerns YYYYMM.



            > Maybe, some discussion at 5.2.1 ... Year would actually set at
            > least me in the right mind set.

            > Also, if your suggestion as to the design is the case, I would
            > expect a note like the one I was surprised to see in 5.2.2.2
            > after 5.2.1.3 (a) and the note at 5.2.3.3 noting all of the
            > variations which are not "fully hyphenated". This would make
            > them all consistent.

            > The note at 5.2.3.3 would not list only one format, but mention
            > all of those which one might think might have a leading dash
            > for missing 'century' and another for missing year pointing
            > out the simplification.

            Now I see what you are saying, I agree that the wording here
            is sub-optimal. You reach a place where you see a format you
            were not expecting, with no previous rationale as to why the
            format is shown like it is. Yes, the standard is deficient
            (unless 4.9 is where it's at?) and requires extra notes.



            > In fact, the standard before the first truncated format in the
            > opening paragraph of 5.2.1.3 says "In each case hyphens that
            > indicate components should be used only as indicated or shall
            > be omitted."

            That paragraph, along with 4.6, and now seeing that several notes
            against examples are obviously missing, when all read together
            do raise some doubts. I agree that the wording is poor.



            > That to me hints that some of choices are arbitrary, so don't
            > play around with them. Also, there is no place that states
            > that all formats are mutually unambiguous from each other.

            I knew that last statement to be true (mutually unambiguous),
            but it takes a bit of tracking down to find those words in the
            very last part of 4.1 '... unique and unambiguous'; but is
            that wording as strong as '*mutually* unambiguous'? I think
            that it probably is.

            I have never considered any of the defined formats to be
            'arbitrary' choices, but now that I have condensed all of the
            standard down to a short table, further on, that table does
            appear to show that to be the case for several of the formats,
            most notably with YYMMDD.



            > As I was reading I was looking for just such a statement or
            > examples that violated the idea. I found neither, but that
            > is no proof.

            I now see what you are trying to say; and I think I have found
            one... YYMMDD vs YYYYYY (excepting the note in 4.7). Also, I
            think that YYMMDD should have been disallowed in favour of
            YYYYMM, which is not currently permitted. See the table below.



            > Hopefully this can all be clarified in the next edition.

            >> OVER TO YOU! There are over 100 people out there reading this.
            >> What say all of you? Am I right? Dan Kohn? Fred Bone? Pete
            >> Forman? Aron Roberts?

            > The question is not whether you are right, it is a question of
            > meaning of the standard. Or another way to put it: You may be
            > right, and I have no reason to think you aren't, but the
            > standard is still not clear on where the "should" comes from.

            I wasn't sure why you were hung up on this one word 'should'.
            Now that you have explained more, I am happy to agree with you.
            You are right. Although the standard works the way I have said,
            and the examples follow the method I have stated, nowhere in
            the standard does it state clearly that this is the case, or
            why it should be so, and several notes of clarification are
            obviously missing on a few examples. It takes someone else
            reading it 'fresh' to spot these errors. I am too 'familiar'
            with how it works to pick up a fundamental error like that.
            If you know how something already works, then it isn't always
            readily obvious that some little note or clarification is
            actually missing. There are 'hints' in 4.6 and 4.9, but not
            enough to satisfy, now that I have read it about 6 times.



            > I personally am now convinced by what you have provided that
            > I was misled by the facts that the very first truncated example
            > YYMMDD doesn't have a leading dash, doesn't have a note which
            > points this out and there is nothing up to that point which
            > says what the expected style is, and there is no consistency
            > in the following examples, so when I finally get to 5.2.2.2
            > "note: ... should be ..." I go back and read looking for
            > something that tells me what should be anywhere and all I see
            > are various examples without explanation that any are
            > exceptions to any expectations. Thus reading the standard
            > does not make me think there is any "should" involved other
            > than the examples as given. That is the source of my question
            > about this note.

            Fully understood. Now that you have explained it, I am tending to
            agree with you. Your point was not that there was an error in
            the format they were suggesting to use, but that there were no
            notes to explain why it should be formatted that way, when you
            were actually expecting to see something else there, after
            following the 'logic' of the previous few paragraphs. Para
            4.9 may be referring, but it isn't obvious or clearly worded.
            I keep saying *may* because I don't really know if it is, or
            if it isn't.



            I have checked the next part of this message very thoroughly
            but due to the huge complexity in compiling it, I cannot
            guarantee that I caught all of the initial typing errors.



            In the next part, I have used a Year of 1212, Month of 12,
            and Day of 12, so that you are not influenced by digits like
            '99' seemingly assuring you these are the last two digits of
            a year... when in fact two digits stated on their own are
            actually for the FIRST two digits of the year (unless by
            'mutual agreement', etc).



            This next part needs to be viewed using a NON-proportional
            typeface, so that it aligns in columns. Copy and paste to a
            Word Processor, if necessary, in order to achieve this.



            There is an inconsistency in the standard, which becomes
            obvious when I list the allowed and non-allowed formats in
            a table, like this (non-allowed are marked with x here;
            'allowed only by mutual agreement' are marked with z; and
            the '!' marking means 'see notes'; read the left and right
            half separately):


                 YYYY-MM-DD  YYYY-MM-DD      YYYYMMDD  YYYYMMDD
                 ----------  ----------      --------  --------

            a    YY          12              YY        12        a
            b    YYYY        1212            YYYY      1212      b
            c    YYYY-MM     1212-12      x! YYYYMM    121212    c
            d    YYYY-MM-DD  1212-12-12      YYYYMMDD  12121212  d
            e    -YY         -12             -YY       -12       e
            f z  YY          12           z  YY        12        f
            g    -YY-MM      -12-12          -YYMM     -1212     g
            h z  YY-MM       12-12        z  YYMM      1212      h
            i x! -YY-MM-DD   -12-12-12    x! -YYMMDD   -121212   i
            j !  YY-MM-DD    12-12-12     !  YYMMDD    121212    j
            k    --MM        --12            --MM      --12      k
            l x  -MM         -12          x  -MM       -12       l
            m z  MM          12           z  MM        12        m
            n    --MM-DD     --12-12         --MMDD    --1212    n
            o x  -MM-DD      -12-12       x  -MMDD     -1212     o
            p z  MM-DD       12-12        z  MMDD      1212      p
            q    ---DD       ---12           ---DD     ---12     q
            r x  --DD        --12         x  --DD      --12      r
            s x  -DD         -12          x  -DD       -12       s
            t z  DD          12           z  DD        12        t

            An additional note must also state that leading hyphens only
            replace elements, and are never separators, for this to work.
            Also note that one digit elements, and three, five and seven
            digit formats are not allowed (unless, I suppose 'by mutual
            agreement...'). In any case, a one digit element would always
            have to be the most left hand element in the expression (e.g.
            YYYMM, YMM, YMMDD) except for showing a decade ('197' for
            the 1970s?), but these are 'horrible' structures, the latter
            especially possibly being almost outside the scope of the
            ISO 8601 standard.


            Notes and Explanations:
            -----------------------

            LEFT:

            Not allowed (x left), because
            (except z by mutual agreement):
            f - used by a, confuse with m t
            h - could be confused with p
            i - logical, but ISO use j
            j - allowed!! In my opinion,
                it should not be allowed.
            l - used by e, confuse with s
            m - used by a, confuse with f t
            o - format used by g
            p - could be confused with h
            r - format used by k
            s - used by e, confuse with l
            t - used by a, confuse with f m

            RIGHT:

            c - In my opinion, this format
                should be allowed, not j !
            f - used by a, confuse with m t
            h - used by b, confuse with p
            i - logical, but ISO use j
            j - allowed!! In my opinion
                it should not be allowed.
            l - used by e, confuse with s
            m - used by a, confuse with f t
            o - format used by g
            p - used by b, confuse with h
            r - format used by k
            s - used by e, confuse with l
            t - used by a, confuse with f m
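The "confuse with" notes can be checked mechanically. The sketch below is an illustration rather than anything defined by the standard: it takes the ten fully hyphenated Basic formats from the RIGHT half of the table (rows a, b, c, d, e, g, i, k, n, q), strips their leading hyphens, and groups the ones that end up with the same number of digits. The names `BASIC_FORMATS` and `collisions` are made up for the example.

```python
from collections import defaultdict

# Fully hyphenated Basic formats, as listed in the RIGHT half of the table.
BASIC_FORMATS = [
    "YY", "YYYY", "YYYYMM", "YYYYMMDD",
    "-YY", "-YYMM", "-YYMMDD",
    "--MM", "--MMDD",
    "---DD",
]


def collisions(formats):
    """Group formats that become indistinguishable once leading hyphens go.

    With the hyphens removed, only the digit count is left to tell a
    string apart, so formats are grouped by that count.
    """
    by_length = defaultdict(list)
    for fmt in formats:
        by_length[len(fmt.lstrip("-"))].append(fmt)
    # Keep only the groups where two or more formats collide.
    return {n: group for n, group in by_length.items() if len(group) > 1}


clashes = collisions(BASIC_FORMATS)
```

Every length except eight digits has more than one candidate, which is why the hyphen-free forms in rows f, h, m, p and t are marked as usable only by mutual agreement.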


            I say 'allowed!!' against the LEFT 'j' format only because
            it is the only 'right-justified' Extended date format that
            is allowed (without a leading hyphen to show the missing
            elements), and it is out of place. On the RIGHT side,
            YYMMDD (j), which is also allowed, will therefore conflict
            with YYYYYY. In my opinion, usage of YYMMDD in the standard
            is NOT correct. They should have allowed the YYYYMM format
            instead. It is strange that the YYYYMM (RIGHT c) date format
            is not allowed, as it totally breaks the logic of the table,
            if you are looking for a pattern. The pattern to me is that
            dates are 'left justified', with hyphens in place of missing
            left elements, one hyphen per two digits omitted (which is a
            definition that neatly avoids having to use a word like
            'century'; as long as it is also stated that the year can be
            two or four digits; or more digits, just as long as only two
            at a time are added), and that reduced precision simply
            deletes digits two at a time from the right of the date.
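The pattern described in the paragraph above is simple enough to generate. The following is a sketch under assumptions: the element labels and the set `STANDARD_BASIC` are transcribed from the RIGHT half of the table in this message (the unmarked rows plus row j), not taken from the standard itself, and `pattern_formats` is a made-up name.

```python
# Two-digit elements, left to right: first year pair, second year pair,
# month, day. Labels follow the table in this message.
ELEMENTS = ["YY", "YY", "MM", "DD"]

# Basic formats the table shows as allowed without mutual agreement
# (rows a, b, d, e, g, j, k, n, q). Transcribed by hand from the table.
STANDARD_BASIC = {
    "YY", "YYYY", "YYYYMMDD",
    "-YY", "-YYMM", "YYMMDD",
    "--MM", "--MMDD", "---DD",
}


def pattern_formats(elements=ELEMENTS):
    """Left-justified formats: one hyphen per omitted leading element,
    reduced precision by dropping elements from the right."""
    formats = []
    for omitted in range(len(elements)):
        for end in range(omitted + 1, len(elements) + 1):
            formats.append("-" * omitted + "".join(elements[omitted:end]))
    return formats


generated = set(pattern_formats())
missing_from_standard = generated - STANDARD_BASIC  # pattern says yes, table says no
outside_pattern = STANDARD_BASIC - generated        # table says yes, pattern says no
```

Under these assumptions the two sets differ in exactly the places discussed here: the pattern produces YYYYMM and -YYMMDD, which the table marks as not allowed, while the allowed YYMMDD falls outside the pattern.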

            Now, looking at the standard condensed to this table, it is
            obvious that the problems also occur when there is a format
            that is allowed in the left column, but disallowed in the
            right column (e.g. c), as the standard does not provide
            enough information to support why this is done. To me that
            is an error.

            Also, the 'mutual agreement' problem appears here again.
            Representations that have the prescribed leading hyphens
            omitted can be used only by mutual agreement... except
            that the format at j seems to be the default, rather than
            the format stated in i. That is, for all the others, mutual
            agreement is required, but for j it has already been
            forced upon us to agree to this. In doing this, you get the
            'logic error' with the RIGHT c entry being disallowed (in
            order to satisfy the {non written, as far as I can see} rule
            that any representation can have only one implied meaning
            unless mutual agreement has already been obtained).

            I guess they had to include YYMMDD and exclude YYYYMM simply
            because millions of computer systems were already using YYMMDD.
            However that probably helped people to avoid thinking about
            Y2K problems for far longer than they should have done.
            1988 would actually have been early enough for every version
            of Windows (3.x onwards) to be completely free of all such
            problems, for example.



            This next part needs to be viewed using a NON-proportional
            typeface, so that it aligns in columns.



            The table can be rearranged to ask what a numerical format
            should be decoded as. To keep it simple, I have not divided
            it into Basic and Extended formats. Anything with a hyphen
            between elements is an Extended format. Writing the table
            this way, I have included some formats that the ISO standard
            says are 'Not Applicable'. There cannot be a way to tell if
            '1950' is supposed to be a Basic format Year or an Extended
            format Year. I have ignored this and included it under both
            styles. The table produces (again, a, b, etc, refer to notes
            after) the following result:


                           ALLOWED            DISALLOWED
                           -------            ----------

            1212-12-12     YYYY-MM-DD
            1212-12        YYYY-MM
            1212           YYYY               z  YYMM MMDD
            12             YY (19 of 1950)    z  YY (50 of 1950) MM DD

            12121212       YYYYMMDD
            121212      a  YYMMDD !           x! YYYYMM !
            1212           YYYY               z  YYMM MMDD
            12             YY (19 of 1950)    z  YY (50 of 1950) MM DD

            1212-12-12     YYYY-MM-DD
            12-12-12    b  YY-MM-DD !
            12-12       c  n/a                z  YY-MM MM-DD !
            12             YY (19 of 1950)    z  YY (50 of 1950) MM DD

            -12-12-12   d  n/a !              x! -YY-MM-DD (use YY-MM-DD) !
            -12-12         -YY-MM             x  -MM-DD
            -12            -YY (50 of 1950)   x  -MM -DD

            -121212     e  n/a !              x! -YYMMDD ! (use YYMMDD) !
            -1212          -YYMM              x  -MMDD
            -12            -YY (50 of 1950)   x  -MM -DD

            --12-12        --MM-DD
            --12           --MM               x  --DD

            --1212         --MMDD
            --12           --MM               x  --DD

            ---12          ---DD

            12-1212        n/a                x  Not allowed at all.


            Notes and Explanations:
            -----------------------

            At a, I wish that 121212 were really YYYYMM, not YYMMDD.

            People use b, but logically the full date is -12-12-12.
            It would be useful if both b and YYMMDD were disallowed, or
            at least reverted to the 'use by mutual agreement' status.

            At c, I am glad that both are not valid, but logically since
            YY-MM-DD was allowed at b, then 12-12 would have to be MM-DD
            (both would then be 'right justified'). However these would be
            the only two 'right justified' dates in the table. Everything
            else is 'left justified, with a hyphen to replace each missing
            pair of digits'. So really it is b that breaks this unwritten
            rule. I wonder if ISO realise what they have done?

            At d and e, these are the formats you would logically expect
            to see being used, but ISO just automatically dropped the
            leading hyphen, producing the formats at a and b instead.
            By breaking the 'pattern' you are correct to say that some
            choices of format now appear to be arbitrary.

            And, z, just confirms the old 'by mutual agreement you can
            drop leading hyphens for any of these date formats' rule,
            and use all the meanings that I have currently placed in the
            'disallowed' column if you need to do so.

            If a line has an 'x', then entries in the 'disallowed' column
            to the right of the 'x' are never allowed to be used.

            For anything marked 'n/a !' in the allowed column, the '!'
            points to surprise at the blank; that ISO have disallowed
            what you would logically expect to be there.

            A date format in the allowed column, with a '!' included, you
            would logically expect to be disallowed, but ISO included it.

            Having allowed YYMMDD in, you would expect YY on its own to
            simply be the 50 of 1950, but it actually represents the 19
            part.

            Logically, both of the entries at ALLOWED 'a' and ALLOWED 'b'
            should really be 'by mutual agreement' formats, but ISO chose
            not to do this.

            From this, it is now obvious that ISO have 'jumped the gun'
            by automatically deleting leading hyphens on some formats
            that would be logically expected to have one or more leading
            hyphens, and have not provided a note to say that this is
            what has happened (unless 4.6 and 4.9 are the hint).



            Another part of the problem is that the date is written as
            4-2-2. If there were a separator between the 'Century' and
            the two digit Year, then semantic rules would be a lot easier.

            Today would be 20-01-07-15. So, leaving off left elements
            would give -01-07-15, --07-15, ---15, and leaving off right
            elements would give 20-01-07, 20-01, and 20. It's the fact
            that the Year is four digits, and the other elements are
            only two digits that skews the pattern; as well as the
            inclusion of YYMMDD in the default standard, rather than
            YYYYMM. For Basic format dates read 20010715, -010715,
            --0715, ---15, 200107, 2001, and 20 noting that here 200107
            is YYYYMM, because YYMMDD is actually done by using -YYMMDD.
            That IS consistent. ISO Ver 3?
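
The pattern described above (every element two digits wide, with one hyphen replacing each element dropped from the left) can be sketched in a few lines of Python. This is only an illustration of the hypothetical 2-2-2-2 scheme discussed here, not of ISO 8601 itself; the function name `truncated_forms` and its `sep` parameter are assumptions made for the example.

```python
def truncated_forms(parts, sep="-"):
    """Return (left_truncated, right_truncated) forms of a date.

    `parts` is a list of two-digit strings, e.g. ["20", "01", "07", "15"].
    Each element dropped from the left is replaced by one hyphen;
    elements dropped from the right simply disappear.
    Use sep="-" for the Extended style and sep="" for the Basic style.
    """
    n = len(parts)
    left = ["-" * i + sep.join(parts[i:]) for i in range(1, n)]
    right = [sep.join(parts[:k]) for k in range(n - 1, 0, -1)]
    return left, right


if __name__ == "__main__":
    today = ["20", "01", "07", "15"]
    print(truncated_forms(today))          # Extended style
    print(truncated_forms(today, sep=""))  # Basic style
```

With the Extended separator this yields -01-07-15, --07-15, ---15 and 20-01-07, 20-01, 20; with an empty separator it yields the Basic strings listed in the paragraph above.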



            Another way of solving the problem, would be to set up a
            standard where the date is a full 8 digits, and has optional
            separators, but if any pair of digits on either the left or
            right end of the date are missing then they are replaced,
            each pair of missing digits, with a hyphen (whereas ISO 8601
            only ever places hyphens on the left side of a date), as
            well as there being a rule to say that there are no leading
            or trailing separators, only hyphens used as replacements.
            In other words, separators are only ever used in order to
            separate digits.


            This would produce something like:


            BASIC FORMAT            EXTENDED FORMAT
            ------------            ---------------

            12121212  YYYYMMDD      1212-12-12  YYYY-MM-DD
            121212-   YYYYMM-       1212-12-    YYYY-MM-
            1212--    YYYY--        1212--      YYYY--
            12---     YY---         12---       YY---
            -121212   -YYMMDD       -12-12-12   -YY-MM-DD
            -1212-    -YYMM-        -12-12-     -YY-MM-
            -12--     -YY--         -12--       -YY--
            --1212    --MMDD        --12-12     --MM-DD
            --12-     --MM-         --12-       --MM-
            ---12     ---DD         ---12       ---DD


            You can always check the date is correctly formed, by using
            a very simple set of rules... Ignore all hyphens BETWEEN
            digits. Group all digits into pairs. Count each leading
            hyphen, each digit pair, and each trailing hyphen. There
            should always be exactly FOUR units.
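
The four-unit check just described could be sketched in Python roughly as follows. It applies only to the alternative notation proposed in this message (hyphens replacing missing pairs on either end), not to ISO 8601 as published. The function name `is_well_formed` is an assumption, and the sketch deliberately does not check that separators are used consistently, since the rule says to ignore hyphens between digits.

```python
import re


def is_well_formed(text: str) -> bool:
    """Apply the 'exactly FOUR units' rule to a candidate date string.

    Units are: each leading hyphen, each pair of digits, and each
    trailing hyphen. Single hyphens between digit pairs are ignored.
    """
    core = text.strip("-")
    # The core must be digit pairs, optionally separated by single hyphens.
    if not re.fullmatch(r"\d\d(-?\d\d)*", core):
        return False
    leading = len(text) - len(text.lstrip("-"))
    trailing = len(text) - len(text.rstrip("-"))
    pairs = len(core.replace("-", "")) // 2
    return leading + pairs + trailing == 4


if __name__ == "__main__":
    for sample in ["12121212", "1212-12-", "-12--", "---12", "1212", "12-1212"]:
        print(sample, is_well_formed(sample))
```

Every entry in the table above passes this check, while a bare 1212 (which would have to be written 1212-- in this notation) does not.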

            A further rule would state that in order to join the date
            to a time, the date should NOT have trailing hyphens, whilst
            still having four units, i.e. the right hand end MUST be
            the Day Number digits.

            This only works for a four digit year system. For all years
            beyond 9999, the standard has to be rewritten to add formats
            with the Year stated as six digits, an extra leading hyphen
            to be added to every one of the above formats, and with the
            format checking rule updated to say that there are now FIVE
            units, rather than only four.

            I don't propose this as a replacement to ISO 8601, but just
            mention it as all these points apply to the current standard
            in some way.



            The more I look at things, the more I am convinced that
            including YYMMDD rather than YYYYMM is the root cause of
            most of the problem. This looks like being a Y2K problem
            in another guise. The original standard was written well
            before the Year 2000. Including YYMMDD simply pandered
            to the, then current, vogue for a 'default' of a two-digit
            year, without thought for the overall logic of the whole
            document. The new standard makes reference to using all
            four digits for the Year to avoid these sorts of problems,
            but this could also have helped in revising the logic of
            the standard... by suggesting that users go back to using
            -YYMMDD in preference to just YYMMDD alone, or, even
            better still, just adopt the full four digit year. This
            would then allow 121212 to be 'YYYYMM' as originally, and
            logically, expected.



            Now, let's re-do those first two tables above for the Ordinal
            Day of Year Format. Note that I have repeated formats like
            YY and YYYY from the first table for completeness; you can't
            tell if YYYY is part of a 'normal' Gregorian Calendar date,
            an Ordinal Date, or a 'Week Number and Day of Week' date, so
            I repeat them here, whereas the official standard does not:


                   YYYY-DDD  YYYY-DDD        YYYYDDD   YYYYDDD
                   --------  --------        -------   -------

            a      YY        12              YY        12        a
            b      YYYY      1212            YYYY      1212      b
            c      YYYY-DDD  1212-121        YYYYDDD   1212121   c
            d      -YY       -12             -YY       -12       d
            e  z   YY        12          z   YY        12        e
            f  x!  -YY-DDD   -12-121     x!  -YYDDD    -12121    f
            g  !   YY-DDD    12-121      !   YYDDD     12121     g
            h  x!  --DDD     --121       x!  --DDD     --121     h
            i  !   -DDD      -121        !   -DDD      -121      i
            j  z   DDD       121         z   DDD       121       j
            k  x   (-)(-)DD  12          x   (-)(-)DD  12        k
            l  x   (-)(-)D   1           x   (-)(-)D   1         l

            An additional note must also state that leading hyphens only
            replace elements, and are never separators, for this to work.
            Again 'z' refers to 'mutual agreement' formats, and 'x' means
            not permitted.


            Notes and Explanations:
            -----------------------

            LEFT:                               RIGHT:

            c - complete                        c - 7 digits: unambiguous.
            e - used by a                       e - used by a
            f - is the logical default, but     f - is the logical default, but
            g - does not conflict with          g - does not conflict with
                anything else                       anything else (5 digits).
            h - logical default, but            h - logical default, but
            i - ISO dropped one leading '-'     i - ISO dropped one leading '-'
            j - mutual agreement, but three     j - mutual agreement, but three
                digits are unambiguous anyway       digits are unambiguous anyway
            k - not allowed, must be 3 digits   k - not allowed, must be 3 digits
            l - not allowed, must be 3 digits   l - not allowed, must be 3 digits

            The Year can be two or four digits (with a leading hyphen
            mandatory for some two digit formats), the Day of Year must
            be three digits.

            You would expect 'g' to be by 'mutual agreement' and for
            'i' to be 'not permitted' if you compare this table with
            the one for the Gregorian Calendar Date, but 4.6 and 4.9
            override this. Entry 'j' is still 'by mutual agreement'
            as would be expected.



            The table can be rearranged to ask what a numerical format
            should be decoded as. To keep it simple, I have not divided
            it into Basic and Extended formats. Anything with a hyphen
            between elements is an Extended format. Writing the table
            in this way, I have included some formats that the ISO
            standard says are 'Not Applicable'. There cannot be a way
            to tell if '1950' is supposed to be a Basic format Year or
            an Extended format Year. I have ignored this and included
            it under both styles. The table produces (again, a, b, etc,
            refer to notes after) the following result:


                          ALLOWED                DISALLOWED
                          -------                ----------

            1212-121      YYYY-DDD
            1212          YYYY               x   YDDD
            12            YY (19 of 1950)    z   YY (50 of 1950)

            1212121       YYYYDDD
            1212          YYYY               x   YDDD
            12            YY (19 of 1950)    z   YY (50 of 1950)

            1212-121      YYYY-DDD
            12-121     a  YY-DDD !
            121           n/a !              z   DDD  x  YYY

            1212121       YYYYDDD
            12121      b  YYDDD !
            121           n/a !              z   DDD  x  YYY

            -12-121    c  n/a !              x!  -YY-DDD (use YY-DDD) !
            -121       e  -DDD !             x   -YYY (-YY or YYYY)
            -12           -YY (50 of 1950)   x   -DD (must be DDD)

            -12121     d  n/a !              x!  -YYDDD ! (use YYDDD) !
            -121       e  -DDD !             x   -YYY (-YY or YYYY)
            -12           -YY (50 of 1950)   x   -DD (must be DDD)

            --121      f  n/a !              x!  --DDD ! (use -DDD) !
            --12       g  n/a                x   --DD (must be DDD)


            Notes and Explanations:
            -----------------------

            At a, ISO automatically dropped the hyphen of -YY-DDD to
            make YY-DDD, just as they did at Gregorian Date 'b'.

            At b, ISO automatically dropped the hyphen of -YYDDD to
            make YYDDD, just as they did at Gregorian Date 'a'.

            For c and d, the disallowed format is the one that you would
            be logically expecting to be allowed, but see notes a and b.

            For e, you would logically expect --DDD, but with no risk
            of misinterpretation ISO automatically dropped the first
            hyphen, leaving -DDD. You could drop all hyphens, because
            this is the only three digit element in the standard
            (unless by mutual agreement you are exchanging YYY, or you
            have dropped the leading W from W527), but ISO insist here
            in having a minimum of one leading hyphen in place of all
            of the missing elements.

            For f, the disallowed format is the one that you would
            be logically expecting to be allowed, but see note e.

            At g, --12 always means --MM for Gregorian Calendar Date;
            two digit day is not allowed here.

            And, z, just confirms the old 'by mutual agreement you can
            drop leading hyphens for any of these date formats' rule,
            and use all the meanings that I have currently placed in the
            'disallowed' column if you need to do so.

            If a line has an 'x', then entries to the right of the 'x'
            in the 'disallowed' column are never allowed to be used.

            For anything marked 'n/a !' in the allowed column, the '!'
            points to surprise at the blank; that ISO have disallowed
            what you would logically expect to be there.

            A date format in the allowed column, with a '!' included, you
            would logically expect to be disallowed, but ISO included it.

            Having allowed YYMMDD in, you would expect YY on its own to
            simply be the 50 of 1950, but it actually represents the 19
            part.

            Of course, whilst formats here for Ordinal Day of Year should
            not conflict with each other, they also must not conflict with
            any format already being used for Gregorian Calendar Date, and
            vice versa. Ditto for the 'Week Number and Day of Week' format.



            There is no need to do the tables for the Week format, because
            the letter 'W' built in, before the week number, will always
            show what is going on, and the single digit day (even if it is
            on its own) has to be the Day of Week, because a single digit
            is not allowed to be used for any other Date (or Time)
            representation (except by 'mutual agreement'...).

            However, looking closely at 5.2.3.3 (c) and (d), reveals
            formats with a SINGLE digit YEAR. This is something not
            mentioned at all for Gregorian or Ordinal date. Why should
            the Week/Day format be any different? There is absolutely
            no explanation here, other than to say that I believe that
            the 1988 version of the standard used to say that formats
            were not limited to the examples listed in the standard,
            just as long as all formats followed the general rules
            about element ordering, use of separators, and consistent
            element length achieved by use of leading zeroes where
            required, etc.

            So, I guess that formats like YYY and Y-DDD and YYY-DDD
            are allowed 'by mutual agreement' but their Basic Formats
            of YYY and YDDD and YYYDDD have a real risk of being
            interpreted, wrongly, as DDD and YYYY and YYMMDD (or
            YYYYMM) respectively.



            Para 4.5, Note 1 says 'The hyphen may also be used to
            indicate omitted components', but it doesn't say whether
            these are components omitted on the left, right, or in the
            middle of a date. It isn't permitted to omit components
            from the middle of a date, but the note does give the
            initial impression that 1999-- is just as valid a date
            format as --1231 is.



            All of these complications with ISO 8601 are why, when most
            people implement the standard, they actually just list the
            formats that they are going to allow, these usually being
            just a small subset of what is actually possible. I *always*
            use a four digit year, for example.



            A BIG MISTAKE. This paragraph, and the next three, have been
            inserted after I had written the whole of this message, and
            was just about to hit the 'Send' button.

            Basic Formats do not include separators. So, how come para
            5.2.1.2 (a) does include a hyphen separator? The YYYY-MM
            format is a *Basic* Format according to the Standard. How
            can that be right? Extended Formats include separators.
            Basic Formats do not include separators, except YYYY-MM
            we are supposed to believe.

            Stand up ISO. You have been caught out. The choice of defined
            formats appears to be arbitrary. I expected to see, and have
            always understood it to be so, that the choice of formats
            was based on a logical pattern. This does not appear to be
            the case. I had already spotted a couple of dodges (regarding
            YYMMDD, YYDDD, and I thought YYYYMM), but this one point,
            'YYYY-MM', has eluded me for a very long time (circa 7
            years!). I have not adjusted my tables to reflect this
            last point, so you can work out for yourself the effect
            that the ISO kludge has on the logic. To be fair, I don't
            think ISO have ever claimed that the standard was based
            on a logical pattern, other than the highest element first
            'Year-Month-Day-Hour-Minute-Second' element ordering, and
            unambiguous representation of dates and times and zones.

            This definition for YYYY-MM carries over from the earlier
            1988 standard which can be downloaded from my FTP site at:
            <ftp://ftp.qsl.net/pub/g1smd/> if you haven't already got
            it. It means that in my first table, everything on the LEFT
            is an Extended Format, except that the entry at LEFT 'c'
            is to be included on the RIGHT and be called a Basic Format.
            That just does not make any sense at all. And, it's all
            because they wanted to define 121212 as YYMMDD, rather than
            accepting YYYYMM. The 'similar' YYYY-DDD (Ordinal Date)
            format is correctly listed as an Extended Format, so this
            listing of YYYY-MM as a Basic Format just does not make
            any sense at all. I've been duped!



            This is the most complicated thing I have written for a few
            weeks. I really hope I have caught all of the typos!



            Cheers,

            Ian.


            <mail://g1smd@...>

            <http://www.qsl.net/g1smd/>
            <http://home.freeuk.net/g1smd/>
            <http://ourworld.compuserve.com/homepages/dstrange/y2k.htm>

            <ftp://ftp.funet.fi/pub/ham/misc/g1smd.zip>
            <ftp://ftp.qsl.net/pub/g1smd/>


            [2001-07-16]

            .end
          • Pete Forman
            Message 5 of 15 , Jul 17, 2001
              g1smd@... writes:
              > [most of the material snipped]

              > I agree that this is a bit sloppy. It needs rewriting, or some
              > additional notes. The minimum required, would be to state that the
              > year may be specified by two, or by four, or by more digits; and I
              > see a problem here... 121212 is assumed to be YYMMDD, but this
              > could be the YYYYYY 121212.

              No, were an expanded representation to be used for a six digit year it
              would be +121212.


              > Also, the 'mutual agreement' problem appears here again.
              > Representations that have the prescribed leading hyphens omitted
              > can be used only by mutual agreement... except that the format at j
              > seems to be the default, rather than the format stated in i. That
              > is, for all the others, mutual agreement is required, but for j it
              > has already been forced upon us to agree to this. In doing this,
              > you get the 'logic error' with the RIGHT c entry being disallowed
              > (in order to satisfy the {non written, as far as I can see} rule
              > that any representation can have only one implied meaning unless
              > mutual agreement has already been obtained).

              As I said in my previous message I consider that there are two
              possible reasons for omitting hyphens. Mutual agreement is one. The
              other is that hyphens should/must only be used to stand for omitted
              components in order to disambiguate.


              > I guess they had to include YYMMDD and exclude YYYYMM simply
              > because millions of computer systems were already using YYMMDD.

              Not necessarily. We are talking about a date format, it is reasonable
              to give preference to the full precision interpretation over the
              reduced one.

              In general ISO 8601 does not say that two digit years are evil. It
              passes them off as a specific case of a truncated representation.

              > The table can be rearranged to ask what a numerical format should
              > be decoded as. To keep it simple, I have not divided it into Basic
              > and Extended formats. Anything with a hyphen between elements is an
              > Extended format.

              As you probably realise, that contradicts 5.2.1.2.

              > Writing the table this way, I have included some formats that the
              > ISO standard says are 'Not Applicable'. There cannot be a way to
              > tell if '1950' is supposed to be a Basic format Year or an Extended
              > format Year. I have ignored this and included it under both styles.

              Again, the standard is clear that '1950' is basic format only. I take
              'Not Applicable' to mean "don't use this".




              What we could do with is a rationale for the standard. I wonder if
              one was produced.

              The other useful production would be a general reader. Given any
              input string it should be possible to determine whether it is basic or
              extended, full or reduced precision, expanded or not, truncated or
              not, calendar or ordinal or week. (At a higher level we need to
              determine whether a string is a date, time, interval, etc.)

              A start for this might be

              Parse as date:
                Does it contain a 'W'?
                  => parse as week date
                else Does it have an even number of digits? **
                  => parse as calendar date
                else
                  => parse as ordinal date

              Parse as calendar date:
                Split into fields of hyphen or pair-of-digits or plus (1st only)
                Match against candidate formats

              Parse as ordinal date:
                Split into fields of hyphen or pair-of-digits or plus (1st only)
                  or triple-of-digits (last only)
                Match against candidate formats

              **This assumes that expanded formats use an even number of digits for
              the year. A different approach might tolerate an odd number.
              Actually, in order to parse expanded formats the number of digits
              for the year must be known otherwise there is no way to distinguish
              days from years from centuries. Years before 0000 are problematic
              as well. But according to 4.3.2.1 mutual agreement is needed for
              years prior to 1582 anyway.
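
The top-level decision in the outline above might look like this in Python. It is a sketch of the outline only, not a complete reader: the function name `classify_date` is an assumption, and it inherits the footnoted assumption that years always contribute an even number of digits, so an odd digit count can only come from a three-digit day of year.

```python
def classify_date(text: str) -> str:
    """Pick a parser for a date string, following the outline above.

    Returns "week", "calendar" or "ordinal". No validation is done
    here; the chosen parser is expected to reject malformed input.
    """
    if "W" in text:
        return "week"
    digit_count = sum(ch.isdigit() for ch in text)
    # Even digit count: pairs only (YY, MM, DD). Odd: ends in DDD.
    return "calendar" if digit_count % 2 == 0 else "ordinal"


if __name__ == "__main__":
    for sample in ["2001-W29-2", "20010717", "2001-07-17", "2001198", "-198"]:
        print(sample, "=>", classify_date(sample))
```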

              The table for calendar dates then starts

              Number of   Fields         Format       Section       Note
              fields
              1           +              illegal
              1           -              illegal
              1           2              YY           5.2.1.2.c.B
              2           + -            illegal
              2           + 2            +YY          5.2.1.4.d.B   1
              2           - -            illegal
              2           - 2            -YY          5.2.1.3.c.B   2
              2           2 -            illegal
              2           2 2            YYYY         5.2.1.2.b.B
              ...
              4           2 2 2 2        YYYYMMDD     5.2.1.1.B
              ...
              6           2 2 - 2 - 2    YYYY-MM-DD   5.2.1.1.E

              Notes:
              1 Implicitly assume that expanded representation years have 4 digits
              2 Implicitly assume that expanded representation years are positive
                or have more than 4 digits
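
The "split into fields, then match against candidate formats" step for calendar dates could be sketched as below. The names `split_fields`, `match_format` and `CANDIDATES` are assumptions made for this example, and the lookup table contains only the rows actually listed above; the elided rows are left out rather than guessed, so anything not listed simply returns None.

```python
import re

# One field is a pair of digits, a hyphen, or a plus sign.
FIELD = re.compile(r"\d\d|-|\+")

# Only the legal rows shown in the table above; "2" stands for a digit pair.
CANDIDATES = {
    ("2",): "YY",
    ("+", "2"): "+YY",   # note 1 applies
    ("-", "2"): "-YY",   # note 2 applies
    ("2", "2"): "YYYY",
    ("2", "2", "2", "2"): "YYYYMMDD",
    ("2", "2", "-", "2", "-", "2"): "YYYY-MM-DD",
}


def split_fields(text: str) -> tuple:
    """Split a calendar date string into '+', '-' and '2' fields."""
    fields = []
    pos = 0
    while pos < len(text):
        m = FIELD.match(text, pos)
        if m is None:
            raise ValueError(f"cannot split {text!r} at position {pos}")
        token = m.group()
        if token == "+" and pos != 0:
            raise ValueError("'+' is only allowed as the first field")
        fields.append("2" if token.isdigit() else token)
        pos = m.end()
    return tuple(fields)


def match_format(text: str):
    """Return the matching format name, or None if the row is not listed."""
    return CANDIDATES.get(split_fields(text))


if __name__ == "__main__":
    print(split_fields("1985-04-12"))
    print(match_format("19850412"), match_format("-85"), match_format("85-"))
```

A fuller reader would need one such table per agreed expanded representation, as the following paragraph points out.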


              Different versions of tables would be needed for different expanded
              representations. Expanded and truncated representations are mutually
              exclusive. The agreement between parties has to be inspected to
              establish whether a leading hyphen means a negative year or truncated
              representation.
              --
              Pete Forman -./\.- Disclaimer: This post is originated
              WesternGeco -./\.- by myself and does not represent
              pete.forman@... -./\.- opinion of Schlumberger, Baker
              http://www.crosswinds.net/~petef -./\.- Hughes or their divisions.
            • P A Hill & E V Goodall
              Message 6 of 15 , Jul 17, 2001
                Pete Forman wrote:
                >
                > P A Hill & E V Goodall writes:
                > > [snip]
                > > In fact, the standard before the first truncated format in the
                > > opening paragraph of 5.2.1.3 says "In each case hyphens that
                > > indicate components should be used only as indicated or shall be
                > > omitted."
                > >
                > > That to me hints that some of the choices are arbitrary, so don't play
                > > around with them. Also, there is no place that states that all
                > > formats are mutually unambiguous from each other.
                >
                > How about the last paragraph in 4.1.

                Very good! That does qualify as stating that are all tying to be unique.

                > Add hyphens where a format is ambiguous. This is bound to be
                > arbitrary: if two formats collide one must be chosen to get the
                > extra hyphen.

                >
                > Note that omitting hyphens by these rules is a separate issue to
                > 5.2.1.3. I take the latter to mean that the communicating parties
                > agree that, for example, two digits alone mean a month rather than
                > using four characters of 5.2.1.3.e proper.

                Yes that is how I read all of the paragraphs that read like that:

                "If, by agreement, truncated representations are used the basic formats shall
                be as specified below. In each case hyphens that indicate omitted components
                shall be used only as indicated or shall be omitted."

                It also tells anyone who wants to claim to be 8601 compliant not to make up
                a format that might be read just like one of the real formats but missing
                only some of the hyphens.

                For example, I can't claim to be 8601 compliant and take one leading
                hyphen off each of the examples in 5.2.1.3.

                Thanks for the comments on how you read 8601.

                -Paul
              • P A Hill & E V Goodall
                Message 7 of 15 , Jul 17, 2001
                  g1smd@... wrote:
                  > > The note at 5.2.3.3 would not list only one format, but mention
                  > > all of those which one might think might have a leading dash
                  > > for missing 'century' and another for missing year pointing
                  > > out the simplification.
                  >
                  > Now I see what you are saying, I agree that the wording here
                  > is sub-optimal. You reach a place where you see a format you
                  > were not expecting, with no previous rationale as to why the
                  > format is shown like it is. Yes, the standard is deficient
                  > (unless 4.9 is where it's at?) and requires extra notes.

                  Actually, I was assuming just mutually unambiguous choices
                  had been made, so was not expecting a particular format. I was
                  thrown off by the writers of the standard expecting a particular
                  format.

                  > I wasn't sure why you were hung up on this one word 'should'.
                  > Now you have explained more, then I am happy to agree with you.
                  > You are right. Although the standard works the way I have said,
                  > and the examples follow the method I have stated, nowhere in
                  > the standard does it state clearly that this is the case, or
                  > why it should be so, and several notes of clarification are
                  > obviously missing on a few examples.

                  I'm glad we got that worked out! Yes, it was more an editorial
                  analysis stated as "did I miss something", than a criticism of
                  a particular format.

                  Thanks for the interesting tables! I was just starting to
                  work on some like these myself.

                  -Paul
                • P A Hill & E V Goodall
                  Message 8 of 15 , Jul 18, 2001
                    P A Hill & E V Goodall wrote:
                    > That does qualify as stating that are all tying to be unique.

                    Let's make that:

                    That does qualify as stating that all are trying to be unique.

                    -Paul
                  • g1smd@amsat.org
                    Message 9 of 15 , Aug 1, 2001
                      On 2001-Jul-16 Pete Forman wrote:


                      [2001-Aug-01]



                      >> In fact, the standard before the first truncated format in the
                      >> opening paragraph of 5.2.1.3 says "In each case hyphens that
                      >> indicate components should be used only as indicated or shall be
                      >> omitted."

                      >> That to me hints that some of the choices are arbitrary, so don't play
                      >> around with them. Also, there is no place that states that all
                      >> formats are mutually unambiguous from each other. As I was reading
                      >> I was looking for just such a statement or examples that violated
                      >> the idea. I found neither, but that is no proof.

                      > How about the last paragraph in 4.1.

                      I agree with that. I didn't find the words '*mutually* unambiguous'.
                      Instead, I found '... unique and unambiguous', which I think just
                      about does the same job.



                      > I agree that the choices seem arbitrary. There may be some logic
                      > behind it though. My guess is that the rules are something like

                      > Replace an omitted component in a truncated format with a hyphen.
                      > Component may be century (first two digits of a four digit year)
                      > or decade (first three digits of a four digit year) or last two
                      > digits of the year or month or week. The term component is not
                      > defined as such but the components are listed for each of the
                      > truncated representations.

                      Not quite. You appear to be saying that -1 is a year like 1981
                      or 2001, stated by omitting the decade. Adding the month to
                      this, to increase precision, makes -111, which can now be
                      confused with -DDD, day 111 of the year. So you should modify
                      your statement to say that, except for Day of Year [DDD] and
                      Day of Week [D] elements, elements should always have an even
                      number of digits: YYYY, YYMM, MMDD, YYMMDD, etc.

                      However, I can see where you may have got this idea from. In
                      the examples for the various 'Week-of-Year and Day-of-Week'
                      formats, there are some three-digit and some single-digit year
                      examples. In those examples, though, the placement of the 'W'
                      always clarifies what is going on. In the Calendar and Ordinal
                      date formats you cannot do this with Basic Formats.

                      The Year must be two or four digits, except for some Extended
                      Format dates that can have a three or single digit year,
                      because these cannot be mixed up with other formats:
                      -Y-DDD -YYY-DDD -Y-MM-DD -YYY-MM-DD and possibly -Y-MM and
                      -YYY-MM (and you can probably omit the leading hyphen on all
                      of these and get away with it). For most of these it is *not*
                      possible to have a Basic format (if 'Basic' formats are taken
                      to mean that hyphen separators *between* digits are omitted),
                      as these *will* then be confused with other pre-defined formats.
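
The collision described above can be sketched in a few lines of Python. This is only an illustration: the KNOWN_BASIC table is an assumption limited to formats named in this thread, not a complete catalogue from the standard, and the function names are made up for the example.

```python
# Assumption: a small, incomplete set of Basic-format shapes mentioned in this
# thread, keyed by (number of leading hyphens, number of digits).
KNOWN_BASIC = {
    (0, 8): "YYYYMMDD",
    (0, 6): "YYMMDD",
    (0, 4): "YYYY",
    (0, 2): "YY",
    (1, 2): "-YY",
    (1, 3): "-DDD",
    (1, 4): "-YYMM",
    (2, 2): "--MM",
    (3, 2): "---DD",
}


def shape(text):
    """Return (leading hyphens, digit count), ignoring interior hyphens."""
    lead = len(text) - len(text.lstrip("-"))
    digits = sum(ch.isdigit() for ch in text)
    return lead, digits


def basic_collision(extended_text):
    """Name the known Basic format that the text would be mistaken for once
    its interior separators are removed, or None if this table has no match."""
    return KNOWN_BASIC.get(shape(extended_text))
```

For example, `basic_collision("-1-11")` (a -Y-MM value) gives "-DDD", and `basic_collision("-1-111")` (a -Y-DDD value) gives "-YYMM", whereas `basic_collision("-121-12-12")` finds nothing in this partial table.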



                      > Remove hyphens if the result is unambiguous.
                      > 4.6 para 1 states that a hyphen may be necessary to represent
                      > an omitted component. That implies to me that the hyphen
                      > should not be used if possible.

                      This 'ruling' seems arbitrary in the standard. A format like
                      12-12 isn't permitted at all, but 121212 is read as YYMMDD,
                      when I would expect YYYYMM to be the one. Similarly -121212
                      doesn't appear anywhere, when I think that -YYMMDD is expected
                      (like -1212 is -YYMM, for example).



                      > (The above two rules may also be expressed as: Components may
                      > be omitted, if the result is ambiguous then use a hyphen to
                      > stand for the omitted component.)

                      I almost agree with this, but I still don't understand why
                      121212 has to be YYMMDD, when YYYYMM would be more logical,
                      and this would then follow a 'pattern' with the other formats.
                      See the various tables that I included in my previous message,
                      posted 2001-Jul-16.



                      > Add hyphens where a format is ambiguous. This is bound to be
                      > arbitrary: if two formats collide one must be chosen to get
                      > the extra hyphen.

                      It isn't always arbitrary. '12' is always the first two digits
                      of a Year, so the last two digits of the Year are -12, the Month
                      is --12, and the Day is ---12. This is clear and logical. Note
                      that 12-12 isn't permitted at all, as it could be either YY-MM
                      or MM-DD. Instead -YY-MM and --MM-DD are used; so none of them
                      'get the extra hyphen' (in this context)... they both include a
                      hyphen or hyphens. So, two formats collide at '12-12', but rather
                      than one getting a hyphen and the other not, the '12-12' format
                      simply isn't defined or used at all.
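
A minimal Python sketch of the leading-hyphen rule just described, where each leading hyphen stands for one omitted element. The function name and the wording of the roles are assumptions made for illustration; they do not come from the standard.

```python
# Each leading hyphen replaces one omitted leading element, so the number of
# hyphens tells us which element a lone two-digit value represents.
ROLE_BY_LEADING_HYPHENS = {
    0: "century (first two digits of the year)",
    1: "year within the century",
    2: "month",
    3: "day of the month",
}


def two_digit_role(text):
    """Classify strings such as '12', '-12', '--12' and '---12'."""
    digits = text.lstrip("-")
    hyphens = len(text) - len(digits)
    if len(digits) != 2 or not digits.isdigit():
        raise ValueError(f"not a lone two-digit element: {text!r}")
    if hyphens not in ROLE_BY_LEADING_HYPHENS:
        raise ValueError(f"too many leading hyphens: {text!r}")
    return ROLE_BY_LEADING_HYPHENS[hyphens]
```

Here `two_digit_role("--12")` returns "month", while `two_digit_role("12-12")` raises ValueError, mirroring the point that '12-12' is not defined at all.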



                      > Note that omitting hyphens by these rules is a separate issue to
                      > 5.2.1.3. I take the latter to mean that the communicating parties
                      > agree that, for example, two digits alone mean a month rather than
                      > using four characters of 5.2.1.3.e proper.

                      Yes, by mutual agreement I can say that '12' in one data element
                      is MM, and in another is DD, rather than the default of YY.



                      What comments have you got regarding the material in my message
                      dated 2001-Jul-16, under the heading 'A BIG MISTAKE'?



                      Cheers,

                      Ian.


                      <mail://g1smd@...>

                      <http://www.qsl.net/g1smd/>
                      <http://home.freeuk.net/g1smd/>
                      <http://ourworld.compuserve.com/homepages/dstrange/y2k.htm>

                      <ftp://ftp.funet.fi/pub/ham/misc/g1smd.zip>
                      <ftp://ftp.qsl.net/pub/g1smd/>


                      [2001-08-01]

                      .end
                    • g1smd@amsat.org
                      Message 10 of 15 , Aug 1, 2001
                        On 2001-Jul-17 Pete Forman wrote:


                        [2001-Aug-01]



                        >> I agree that this is a bit sloppy. It needs rewriting, or some
                        >> additional notes. The minimum required, would be to state that
                        >> the year may be specified by two, or by four, or by more digits;
                        >> and I see a problem here... 121212 is assumed to be YYMMDD,
                        >> but this could be the YYYYYY 121212.

                        > No, were an expanded representation to be used for a six digit
                        > year it would be +121212.

                        I did already refer to this in another paragraph, where I noted:
                        >>> .... I see a problem here... 121212 is assumed to be YYMMDD,
                        >>> but this could be the YYYYYY 121212. Having re-read the
                        >>> standard I see that para 4.7 does cover this. Additionally,
                        >>> para 4.8 does say that elements do all have a defined length,
                        >>> and that leading zeroes must be used to fulfil this. ....
                        but I forgot to repeat that note with the above text.



                        >> Also, the 'mutual agreement' problem appears here again.
                        >> Representations that have the prescribed leading hyphens omitted
                        >> can be used only by mutual agreement... except that the format at j
                        >> seems to be the default, rather than the format stated in i. That
                        >> is, for all the others, mutual agreement is required, but for j it
                        >> has already been forced upon us to agree to this. In doing this,
                        >> you get the 'logic error' with the RIGHT c entry being disallowed
                        >> (in order to satisfy the {non written, as far as I can see} rule
                        >> that any representation can have only one implied meaning unless
                        >> mutual agreement has already been obtained).

                        > As I said in my previous message I consider that there are two
                        > possible reasons for omitting hyphens. Mutual agreement is one.
                        > The other is that hyphens should/must only be used to stand for
                        > omitted components in order to disambiguate.

                        Unfortunately, since ISO mixed their logic in deciding on YYMMDD
                        over YYYYMM, this skews the expected logical 'pattern' of allowed
                        formats, as shown in the tables in my message posted 2001-Jul-16.
                        I think their choices of 'default' formats are somewhat arbitrary.



                        >> I guess they had to include YYMMDD and exclude YYYYMM simply
                        >> because millions of computer systems were already using YYMMDD.

                        > Not necessarily. We are talking about a date format, it
                        > is reasonable to give preference to the full precision
                        > interpretation over the reduced one.

                        So why isn't 1212 decoded as MMDD, instead of YYYY?
                        It seems very odd to me, that the formats go:
                        12121212  YYYYMMDD
                        121212    YYMMDD
                        1212      YYYY
                        12        YY (19 of 1950)
                        Surely, life would be much easier if 121212 were YYYYMM?

                        I was expecting one of the following patterns:
                        12121212  YYYYMMDD
                        121212    YYYYMM
                        1212      YYYY
                        12        YY (19 of 1950)
                        or:
                        12121212  YYYYMMDD
                        121212    YYMMDD
                        1212      MMDD
                        12        DD
                        or:
                        12121212  YYYYMMDD
                        121212    YYMMDD
                        1212      YYYY
                        12        YY (50 of 1950)
                        The last three all have a logical pattern to them, whereas
                        the first table (as derived from the ISO 8601 standard) does
                        not have a logical pattern. Have another look at the various
                        tables in my previous message (the one dated 2001-Jul-16)
                        for further information.
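
To make the difference between the tables concrete, here is a small Python sketch that decodes an all-digit string purely by its length. The two lookup tables restate the first table (as read from the standard in this message) and the first alternative; the names AS_IN_STANDARD, YEAR_FIRST_ALTERNATIVE, split_by_template and decode are hypothetical.

```python
from itertools import groupby

# Digit count -> template, as in the first table of this message.
AS_IN_STANDARD = {8: "YYYYMMDD", 6: "YYMMDD", 4: "YYYY", 2: "YY"}
# Digit count -> template, as in the first alternative pattern.
YEAR_FIRST_ALTERNATIVE = {8: "YYYYMMDD", 6: "YYYYMM", 4: "YYYY", 2: "YY"}


def split_by_template(text, template):
    """Cut text into fields following runs of identical template letters."""
    fields = {}
    pos = 0
    for letter, run in groupby(template):
        width = len(list(run))
        fields[letter * width] = text[pos:pos + width]
        pos += width
    return fields


def decode(text, table):
    """Decode an unseparated digit string using a length-to-template table."""
    if not text.isdigit() or len(text) not in table:
        raise ValueError(f"cannot decode {text!r} with this table")
    return split_by_template(text, table[len(text)])
```

With these tables, `decode("195007", AS_IN_STANDARD)` yields YY=19, MM=50, DD=07, whereas `decode("195007", YEAR_FIRST_ALTERNATIVE)` yields YYYY=1950, MM=07. The same six digits change meaning entirely depending on which table is taken as the default.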



                        > In general ISO 8601 does not say that two digit years are evil. It
                        > passes them off as a specific case of a truncated representation.

                        Most formats that have a two digit year have a leading hyphen.
                        Only YYMMDD does not, at the expense of YYYYMM being disallowed.
                        I don't understand why.



                        >> The table can be rearranged to ask what a numerical format should
                        >> be decoded as. To keep it simple, I have not divided it into Basic
                        >> and Extended formats. Anything with a hyphen between elements is an
                        >> Extended format.

                        > As you probably realise, that contradicts 5.2.1.2.

                        That is so illogical. What is a Basic Format? What is an Extended Format?

                        A simple answer would be (you would think) that an Extended Format
                        includes separators between elements, and a Basic Format always has
                        them omitted. However, because someone at ISO decided that 121212
                        would be YYMMDD (the Basic version of YY-MM-DD), then YYYYMM has been
                        disallowed. A Year and Month always has to have a hyphen separator:
                        YYYY-MM. But why is it then called a Basic Format? This is the only
                        Basic Format in the whole standard that includes any separators.

                        I repeat, again, just what is a Basic Format? Give me a simple
                        definition. Hyphen separators are not it; unless ISO have made
                        a mistake and it is meant to be:

                        Year and Month:
                        ---------------
                        *Extended* Format: YYYY-MM
                        Basic Format: Not Applicable (because 121212 is YYMMDD)

                        but as already stated, I think ISO made a fundamental error in
                        allowing YYMMDD over YYYYMM in the first place. That is where
                        the heart of the whole problem lies.



                        >> Writing the table this way, I have included some formats that the
                        >> ISO standard says are 'Not Applicable'. There cannot be a way to
                        >> tell if '1950' is supposed to be a Basic format Year or an Extended
                        >> format Year. I have ignored this and included it under both styles.

                        > Again, the standard is clear that '1950' is basic format only.
                        > I take 'Not Applicable' to mean "don't use this".

                        Take a date like 1212-12-12, reduce the precision to 1212-12,
                        then to 1212. Now do the same with 12121212, reduce to 1212-12
                        (121212 not allowed!!), then to 1212. So, 1212-12-12 is an
                        Extended Format, and 12121212 is a Basic Format; but both reduce
                        to 1212 for just the Year. So, really, 1212 could be a Basic
                        Format or an Extended Format; there is no way to tell. What I
                        think the ISO standard means by 'Not Applicable' is simply this:
                        because 1212 contains no hyphen separators, 1212 (Extended) is
                        exactly the same as 1212 (Basic). The Extended Format has no
                        unique definition of its own, so there is no need to repeat
                        the definition shown for the Basic Format; you just use the
                        format already listed there.
                        However, I am also assuming that the difference between an
                        Extended Format and a Basic Format is that the Basic Format
                        does not include any hyphens used as separators.



                        > What we could do with is a rationale for the standard. I wonder
                        > if one was produced.

                        > The other useful production would be a general reader. Given any
                        > input string it should be possible to determine whether it is basic
                        > or extended, full or reduced precision, expanded or not, truncated
                        > or not, calendar or ordinal or week. (At a higher level we need to
                        > determine whether a string is a date, time, interval, etc.)

                        There is NO pattern to the ISO standard. Many of the choices
                        are arbitrary... viz YYMMDD vs YYYYMM and so on. This makes
                        finding a 'simple' rule impossible.



                        > A start for this might be

                        > Parse as date:
                        >   Does it contain a 'W'?
                        >     => parse as week date
                        >   else Does it have an even number of digits? **
                        >     => parse as calendar date
                        >   else
                        >     => parse as ordinal date

                        > Parse as calendar date:
                        >   Split into fields of hyphen or pair-of-digits or plus (1st only)
                        >   Match against candidate formats

                        > Parse as ordinal date:
                        >   Split into fields of hyphen or pair-of-digits or plus (1st only)
                        >   or triple-of-digits (last only)
                        >   Match against candidate formats

                        >**This assumes that expanded formats use an even number of digits for
                        > the year. A different approach might tolerate an odd number.
                        > Actually, in order to parse expanded formats the number of digits
                        > for the year must be known otherwise there is no way to distinguish
                        > days from years from centuries. Years before 0000 are problematic
                        > as well. But according to 4.3.2.1 mutual agreement is needed for
                        > years prior to 1582 anyway.
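
The top-level dispatch quoted above can be sketched in Python as follows. It is a direct transcription of the quoted rule under the same stated assumption (years always use an even number of digits); the function name classify_date is made up for this example.

```python
def classify_date(text):
    """Apply the quoted rule: 'W' means a week date, an even digit count
    means a calendar date, and an odd digit count means an ordinal date."""
    if "W" in text:
        return "week"
    digits = sum(ch.isdigit() for ch in text)
    if digits == 0:
        raise ValueError(f"no digits in {text!r}")
    # Assumption from the quoted note: years use an even number of digits,
    # so only a three-digit day-of-year can make the total odd.
    return "calendar" if digits % 2 == 0 else "ordinal"
```

So `classify_date("19850412")` gives "calendar" and `classify_date("1985-102")` gives "ordinal". Note that an Extended value with a single-digit year, such as "-1-12-12", has five digits and would be labelled "ordinal" by this rule, which is one reason the even-digit assumption matters.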

                        > The table for calendar dates then starts:

                        > Number of  Fields       Format      Section      Note
                        > fields
                        > 1          +            illegal
                        > 1          -            illegal
                        > 1          2            YY          5.2.1.2.c.B
                        > 2          + -          illegal
                        > 2          + 2          +YY         5.2.1.4.d.B  1
                        > 2          - -          illegal
                        > 2          - 2          -YY         5.2.1.3.c.B  2
                        > 2          2 -          illegal
                        > 2          2 2          YYYY        5.2.1.2.b.B
                        > ...
                        > 4          2 2 2 2      YYYYMMDD    5.2.1.1.B
                        > ...
                        > 6          2 2 - 2 - 2  YYYY-MM-DD  5.2.1.1.E
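
The field-splitting and table-matching steps can be sketched in Python like this. CALENDAR_TABLE holds only the legal rows shown in the partial table above; anything else returns None, which covers both the rows marked illegal and the rows that were elided. The names fields and match_calendar are hypothetical.

```python
import re

# A field is a plus sign, a hyphen, or a pair of digits.
TOKEN = re.compile(r"\+|-|\d\d")

# Only the legal rows from the partial table; "2" stands for a pair of digits.
CALENDAR_TABLE = {
    ("2",): "YY",
    ("+", "2"): "+YY",
    ("-", "2"): "-YY",
    ("2", "2"): "YYYY",
    ("2", "2", "2", "2"): "YYYYMMDD",
    ("2", "2", "-", "2", "-", "2"): "YYYY-MM-DD",
}


def fields(text):
    """Split text into fields; a plus sign is accepted in first place only."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot split {text!r} at position {pos}")
        token = match.group()
        if token == "+" and pos != 0:
            raise ValueError("'+' is only allowed as the first field")
        tokens.append("2" if token.isdigit() else token)
        pos = match.end()
    return tuple(tokens)


def match_calendar(text):
    """Return the matching format name, or None if this partial table has none."""
    return CALENDAR_TABLE.get(fields(text))
```

For instance, `match_calendar("1950-07-14")` returns "YYYY-MM-DD" and `match_calendar("-50")` returns "-YY", while `match_calendar("12-")` returns None. Extending the table to the left-truncated formats is where the hard part starts.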

                        I see that your table deals only with stuff that begins with the
                        Year. That is all easy. See if you can finish it, when you deal
                        with left-truncated stuff: both full and reduced precision.
                        It becomes a LOT more difficult.



                        > Notes:
                        > 1 Implicitly assume that expanded representation years have 4 digits
                        > 2 Implicitly assume that expanded representation years are positive
                        > or have more than 4 digits

                        > Different versions of tables would be needed for different expanded
                        > representations. Expanded and truncated representations are mutually
                        > exclusive. The agreement between parties has to be inspected to
                        > establish whether a leading hyphen means a negative year or truncated
                        > representation.

                        It gets very complicated, doesn't it? My tables of Allowed and
                        Disallowed formats in the message dated 2001-Jul-16 may help
                        to guide you to look for logic errors.



                        Cheers,

                        Ian.


                        <mail://g1smd@...>

                        <http://www.qsl.net/g1smd/>
                        <http://home.freeuk.net/g1smd/>
                        <http://ourworld.compuserve.com/homepages/dstrange/y2k.htm>

                        <ftp://ftp.funet.fi/pub/ham/misc/g1smd.zip>
                        <ftp://ftp.qsl.net/pub/g1smd/>


                        [2001-08-01]

                        .end
                      • P A Hill & E V Goodall
                        Message 11 of 15 , Aug 2, 2001
                          g1smd@... wrote:
                          > Adding the month to
                          > this, to increase precision, will make -111. This can now be
                          > confused with -DDD, day 111 of the year.

                          I think this is why increased precision is only done when the
                          exchanging parties agree. I think this gets around the problem of
                          whether, given some arbitrary sequence, we can guess what it is.

                          -Paul