
[ISO8601] Re: Clarifications: 5.2.2.2

  • g1smd@amsat.org
    Message 1 of 15 , Jul 13, 2001
      >> On 2001-Jul-13 Paul Hill <goodhill@...> wrote:


      [2001-Jul-14]



      >>> I was reading ISO/TC 154 N 362
      >>> ISO8601:2000(E)2000-12-19

      >> The ISO/TC 154 N 362 [PDF] document to which you refer is the
      >> final draft version of the ISO 8601:2000 standard. The published
      >> version of ISO 8601:2000 is not available online, but the draft
      >> (from only four days previously) can still be downloaded from:
      >> <http://lists.ebxml.org/archives/ebxml-core/200104/msg00252.html>.

      > Are you suggesting I don't have this final draft? What evidence do
      > you have of that?

      I gave the address of the document, and the fact that it is the final
      draft version of the standard, for all the other people out there who
      may have read your message and not have known what 'ISO/TC 154 N 362'
      actually refers to. I am not sure why you think I was suggesting
      anything else. I thought I already explained that with the following
      words...

      >> This is for other people reading this, who may wish to get their
      >> own copy of this document.



      >> There are several minor typos, and
      >> some formatting problems in this draft document. I do not know
      >> if any of those have been corrected in the published version.

      That is an important point. Have you checked this out?



      >> To answer your question: They are not referring to the Extended
      >> format. They are indeed referring, correctly, to the Basic format.

      > There is nothing "correctly", either they are or they aren't
      > referring to the extended format.

      > Thank you for your clarification.

      They are referring to the Basic Format. And there is every
      possibility to be incorrect. They are referring to it, and
      in this case they are correct in referring to it. They could
      have been referring to it, and been incorrect in referring to
      it. Or they could have not been referring to it, which may
      have been correct, or it may have been incorrect. They are
      referring to the Basic format, and they are correct in
      referring to the Basic format, because that is what the
      paragraph is actually supposed to be referring to. Why is
      this so difficult? What version of the English language are
      you using? Are you in the Pedantic Club? I was a Founder
      Member <g>. I *designed* the T-shirt! ONLY JOKING!!



      >> The hyphen in this application is NOT a separator, it replaces
      >> missing data elements. They used to make a distinction between
      >> 'Century' and 'Year', referring to 'CCYY'. When referring to a
      >> two digit year, a leading hyphen is supposed to be used. So,
      >> '-99' was Year 99 in the current 'Century'. Therefore, when
      >> specifying only the day of the year, --DDD is the correct form;
      >> there are two missing elements, the 'Century' and the 'Year'.

      > This while valid sounding is obviously incorrect.
      > I am familiar with the old CCYY idea, but a quick check of the
      > just preceding example and formats in the same section
      > reveal that leaving out the 'century' does not lead to the
      > insertion of a dash.

      Why is it *obviously* incorrect? Maybe this bit really is the
      correct bit, and the other bits you refer to are actually
      the ones that are incorrect? I am not saying that is how it
      is, but it *is* a possibility. As it happens *both* parts are
      correct. Read on...



      > <quote>
      > a) A specific year and day in the implied century
      > Basic format: YYDDD EXAMPLE 85102

      This is correct, because a 5 digit number can only be formed
      by a two digit year, and a three digit 'day of year' number.
      As there is no other possibility, the leading hyphen can be
      omitted without risk of confusion with any other formats.
      So, 85102 and -85102 would mean exactly the same thing, but
      the -85102 form is not used.



      > Extended format: YY-DDD EXAMPLE 85-102

      Ditto.



      > b) Day only in the implied year
      > Basic format: -DDD EXAMPLE -102

      The same applies, but logically --102 should be used. You
      can get away with -102 without risk of misinterpretation.

      You could omit all hyphens here, and still know it is the Day
      of Year number, but ISO insist on always having some digits or
      a hyphen in the Year position, whereas they don't insist on a
      hyphen if just the 'century' is missing. I put 'century' in
      quotes because I do not like using that word.

      > Extended format: not applicable
      > </quote>



      > Note that in the above that the YY-DDD form does not have any
      > leading dash in either these truncated basic or truncated
      > extended version.

      Section 4.6 refers. 'These leading hyphens may be omitted in
      the applications where there is no risk of confusing these
      representations with others'. Any leading hyphen would always
      be to replace a missing element. Hyphens between elements are
      separators in Extended Formats. There are no separators in the
      Basic formats. There are never any separators before the first
      element, only 'replacement' hyphens for missing elements.
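
A minimal sketch of how the truncated ordinal forms discussed above could be read, assuming only the forms quoted from 5.2.2.2 plus the "logical" variants with an extra leading hyphen. The function name `parse_truncated_ordinal` and the 1 to 366 range check are illustrative assumptions, not taken from the standard.

```python
import re

def parse_truncated_ordinal(text):
    """Return (two_digit_year_or_None, day_of_year) for a truncated ordinal date.

    Accepts YYDDD and YY-DDD, the unused logical forms -YYDDD and -YY-DDD,
    and the day-only forms -DDD and --DDD (assumption: both are tolerated).
    """
    with_year = re.fullmatch(r"-?(\d{2})-?(\d{3})", text)
    if with_year:
        year, day = int(with_year.group(1)), int(with_year.group(2))
    else:
        day_only = re.fullmatch(r"--?(\d{3})", text)
        if day_only is None:
            raise ValueError(f"not a truncated ordinal date: {text!r}")
        year, day = None, int(day_only.group(1))
    if not 1 <= day <= 366:
        raise ValueError(f"day of year out of range: {day}")
    return year, day

print(parse_truncated_ordinal("85102"))   # year 85, day 102
print(parse_truncated_ordinal("85-102"))  # same date, Extended format
print(parse_truncated_ordinal("-102"))    # day only, year implied
```

Because a three-digit element can only be a day of the year, the parser never needs the leading hyphen to tell the forms apart, which is the point being made about section 4.6.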



      > So, therefore you have not justified the use of the word
      > 'should' in the original quote.

      I don't want to have an argument about each individual syllable,
      and comma of my previous email. Life is too short. It's all in
      paragraph 4.6 as far as I can see.



      >> In ISO 8601:2000 they have dropped the term 'Century' because,
      >> for example, the Year 2001, has a 'CC' of '20' but is clearly
      >> in Century '21' in common parlance.

      > Not counting that while 2001 is in the 21st, 2000 is in the
      > 20th Century in traditional century counting. I say
      > traditional, because by any obvious measure the "common"
      > celebration of the century change happened when the digits
      > changed back at 2000-01-01 T 00:00:00

      The word 'Century' and the use of 'CCYY' has caused too many
      problems. Thankfully only 'YYYY' is now used, and many people
      now also completely avoid the use of any two digit year formats
      in 'real life'. I try to avoid the word 'Century' in everything
      that I do, and I always write, program, type, print all four
      digits for the year.



      > The problem is that the document still contains references to
      > the term 'century', but this term is not defined to be the
      > hundreds part of a four digit year specification. I think the
      > problem could be solved by defining either 'century' to mean
      > the numerical part above 100 or consistently use a term like
      > 'numerical century' and define that, either way, now that CC
      > doesn't appear (which did have a definition) there is a word
      > definition missing in the spec.

      My personal preference is for all the references to 'Century'
      to be completely eradicated. Whether you define it or not,
      people will always get confused. I would much prefer to have
      a four digit year, with the option for more, but not less,
      digits to be included.

      I see the word 'centennial' a lot, meaning the year that ends
      in '00'. They tried hard to avoid the word 'century' in many
      places; but yes, it has crept back in. Does this still appear
      in the published ISO 8601:2000? Additionally there is a major
      review of ISO 8601 going on, which is leading to a complete
      rewrite. The 2000 version took over three years to complete.
      I wonder how long the 'full' review will take. Committees!!!!



      >> I am not sure if the '-YY' two-digit year stuff is still in
      >> the ISO 8601:2000 version or not. I will check this later.

      > -YY appears in truncated calendar dates, i.e. -YY-MM-DD,
      > see examples of section 5.2.1.3.

      5.2.1.3 does not contain an example like -YY-MM-DD in my copy.
      It does have -YYMM and -YY-MM, and -YY. It does have a YY-MM-DD,
      but note the lack of a leading hyphen on this one. It is shown
      as YY-MM-DD not as -YY-MM-DD. You have misquoted the standard.

      In all these formats: -YYMM and -YY-MM, and -YY, the hyphen
      does replace the missing two digits of the 'century'. In
      YY-MM-DD it has been left out, as per para 4.6.

      In order to understand a format like --MM-DD, you also need to
      realise that the third hyphen is a separator, but the first two
      each sit in place of a missing element, in this case, the
      'Century', and the Year. The Basic format (Basic Formats do not
      have separators) equivalent for this, is --MMDD as you can see.
      The hyphens here, replace both the Century and the Year.
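
The "one hyphen per missing leading element" reading can be sketched as follows. This is only an illustration of the interpretation argued in this message: the helper name `logical_basic` and its keyword parameters are hypothetical, and it produces the logical Basic form before any hyphens are dropped under para 4.6.

```python
def logical_basic(cc=None, yy=None, mm=None, dd=None):
    """Build the 'logical' Basic-format string for a left-truncated calendar date.

    Each omitted leading element is replaced by one hyphen; omitted trailing
    elements (reduced precision) are simply dropped. Basic format has no
    separators, so the elements are concatenated directly.
    """
    parts = [cc, yy, mm, dd]
    while parts and parts[-1] is None:   # drop trailing omissions
        parts.pop()
    if not parts:
        raise ValueError("at least one element is required")
    out = []
    seen_value = False
    for part in parts:
        if part is None:
            if seen_value:
                raise ValueError("elements may only be omitted at the ends")
            out.append("-")              # placeholder for a missing element
        else:
            seen_value = True
            out.append(f"{part:02d}")
    return "".join(out)

print(logical_basic(mm=5, dd=5))   # --0505
print(logical_basic(yy=5, mm=5))   # -0505
print(logical_basic(dd=5))         # ---05
```

Under this reading, YYMMDD would logically be written with one leading hyphen, and the published form without it is the simplified one.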



      >> Perhaps other people on this list may like to add
      >> further comments and corrections to what I have said.

      > I'd like to hear from others, I think this note should be
      > re-written to discuss that there is a difference between...

      Section 4.6 refers. 'These leading hyphens may be omitted in
      the applications where there is no risk of confusing these
      representations with others'.

      05-05-05 has to be YY-MM-DD. It cannot be mistaken for anything
      else, but you could write -YY-MM-DD, the initial hyphen replacing
      the 'Century'. However a date written like 05-05 could be either
      YY-MM or MM-DD. This is where the extra - or -- comes in, to
      clarify. In fact, '05-05' is not allowed.



      > 5.2.1.3 (a) YY-MM-DD <-- no dash for missing numeric century

      Yes, both 10-10-10 and -10-10-10 mean the same, and cannot be
      confused with anything else, so the initial hyphen is allowed
      to be missed out.



      > 5.2.1.3 (b) -YY-MM <-- dash for missing numeric century

      Yes, because 05-05 could be MM-DD or YY-MM. The -05-05 clarifies
      it as -YY-MM. Para 5.2.1.3 (d) then clarifies --MMDD or --MM-DD
      for the Month and Day representation; two elements missing, as
      I said before, hence two hyphens. 05-05 isn't allowed at all,
      because the Basic format would have to be 0505. This would
      conflict with YYYY which would really be the Year AD 505. Only
      -YY-MM, -YYMM, --MM-DD, and --MMDD are therefore allowed.

      Additionally, if you think about a format like 98-12-30, you can
      remove the separators to make 981230, and this is still obviously
      YYMMDD (981230 isn't allowed to be YYYYMM), but if you allow 20-10
      as MM-DD, then removing the hyphen separator would give 2010,
      which could variously be read as either YYYY or MMDD or YYMM.
      This is why 20-10 is not a valid format. It has to be -20-10 for
      -YY-MM and --20-10 for --MM-DD, and the 2010 is always YYYY.
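
To see why a bare digit string needs the extra hyphens, here is a rough sketch that lists every layout a hyphen-free string could plausibly have. The function name `candidate_readings` and the simple month (1 to 12) and day (1 to 31) range checks are my own assumptions for illustration; the standard resolves these collisions by rule, not by range checking.

```python
def candidate_readings(digits):
    """List the layouts a bare 4- or 6-digit string could plausibly represent."""
    if not digits.isdigit():
        raise ValueError(f"digits only, got {digits!r}")

    def is_month(s):
        return 1 <= int(s) <= 12

    def is_day(s):
        return 1 <= int(s) <= 31

    readings = []
    if len(digits) == 4:
        readings.append("YYYY")                       # any four digits can be a year
        if is_month(digits[2:]):
            readings.append("YYMM")
        if is_month(digits[:2]) and is_day(digits[2:]):
            readings.append("MMDD")
    elif len(digits) == 6:
        if is_month(digits[2:4]) and is_day(digits[4:]):
            readings.append("YYMMDD")
        if is_month(digits[4:]):
            readings.append("YYYYMM")
    return readings

print(candidate_readings("0505"))    # three competing readings
print(candidate_readings("121212"))  # two competing readings
```

With leading hyphens the collision disappears: -0505 can only be -YYMM and --0505 can only be --MMDD, leaving the bare 0505 to mean YYYY.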



      > 5.2.2.2 (a) YY-DDD <-- no dash for missing numeric century

      Yes, because in 05-005, the '005' has to be the Day of the Year,
      there is absolutely no other possibility. Therefore, the element
      before that has to be a two-digit year. The leading hyphen can
      therefore be omitted, exactly as per the YY-MM-DD example, above.
      That is, -YY-DDD is unnecessary, YY-DDD will suffice.



      > and then the claim that
      > -DDD should be --DDD which is like 5.2.1.3 (b), but not like
      > the others, so I'm still suspicious of this 'should' idea.

      So, '005' has to be the Day of the Year. In principle it should
      be '--005', but '-005' means exactly the same, as per Para 4.6.
      ISO require that something is put in place of the missing year,
      but do not insist on having two hyphens, as per para 4.6.

      I know this to be right. See 5.2.1.3 (e) for --MM and
      5.2.1.3 (f) for ---DD in exactly the same way. Here,
      a number like '05' could be a 'Century', a Year, a Month,
      or a Day. The number of leading hyphens clarifies exactly
      what it is. Each hyphen is for a missing element. None
      are separators.

      So, -05 is Year 05, --05 is Month 05, and ---05 is Day 05.
      Day 005 of Year should be --005 but -005 will suffice, because
      with three digits it cannot possibly be anything else; remember
      the rule about leading zeroes. The number 05 on its own is
      always 'Century' 05, that is, approx 1500 years ago.
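
The hyphen-counting rule in the last two paragraphs can be written down as a small classifier. This is a sketch of the interpretation given here, not of any official algorithm; the function name `classify_truncated` and the returned labels are illustrative.

```python
import re

def classify_truncated(text):
    """Name the single element in a left-truncated value by counting leading hyphens."""
    match = re.fullmatch(r"(-*)(\d+)", text)
    if match is None:
        raise ValueError(f"expected hyphens followed by digits: {text!r}")
    hyphens, digits = len(match.group(1)), match.group(2)

    # Three digits can only be a day of the year; -DDD and --DDD mean the same.
    if len(digits) == 3 and hyphens in (1, 2):
        return "day of year"

    # Two digits: each leading hyphen stands for one missing element.
    two_digit = {0: "century", 1: "year", 2: "month", 3: "day of month"}
    if len(digits) == 2 and hyphens in two_digit:
        return two_digit[hyphens]

    raise ValueError(f"not a recognised truncated form: {text!r}")

for sample in ("05", "-05", "--05", "---05", "-005", "--005"):
    print(sample, "->", classify_truncated(sample))
```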

      Exactly the same argument applies to 5.2.3.3 NOTE 2.

      Notice also Para 5.2.1.2 (a) and 5.2.1.3 (a) Basic Format in
      sorting out whether 121212 is YYYYMM or YYMMDD... Here,
      YYYYMM is not allowed, but YYMMDD is allowed. The YYYY-MM
      format has to be used for Year and Month, otherwise you would
      not know what 121212 means. So 121212 is YYMMDD, 12-12-12
      is YY-MM-DD, and 1212-12 is YYYY-MM. 12-1212 is also not valid.
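
As a small illustration of the paragraph above, the following sketch checks a string against just the three layouts it names. The table `FORMATS` and the function `identify` are hypothetical names, and the patterns check shape only, not whether the month or day values are in range.

```python
import re

# Only the layouts named in the paragraph above; YYYYMM is deliberately absent.
FORMATS = {
    "YYMMDD": r"\d{6}",
    "YY-MM-DD": r"\d{2}-\d{2}-\d{2}",
    "YYYY-MM": r"\d{4}-\d{2}",
}

def identify(text):
    """Return the name of the matching layout, or None if the string is not valid."""
    for name, pattern in FORMATS.items():
        if re.fullmatch(pattern, text):
            return name
    return None

for sample in ("121212", "12-12-12", "1212-12", "12-1212"):
    print(sample, "->", identify(sample))
```

Because each permitted layout has a distinct shape, at most one pattern can match, and 12-1212 matches none of them.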

      I never use left truncated dates. I always use the full
      YYYY-MM-DD, or a reduced precision (right truncated) date like
      YYYY or YYYY-MM.



      > Someone want to reconcile this difference?

      OVER TO YOU! There are over 100 people out there reading this.
      What say all of you? Am I right? Dan Kohn? Fred Bone? Pete
      Forman? Aron Roberts?



      >> Corrections have to go via your national standards body, and
      >> you usually have to be a 'member'. ISO do not take comments
      >> directly from 'the public'.

      >> This is not a typo, in my opinion.

      > No, I didn't think it was. I think the note is left over from
      > an earlier edition -- from the time of CC -- which needs to be
      > re-worked.

      As this is a draft edition, I would want to see the REAL version
      of ISO 8601:2000 before commenting further.



      > My interest in typos is for the simple ones elsewhere. As I
      > said:

      >>> Also, to where or whom do I send simple typo corrections?

      > Note the word "also". That sentence was a separate question.
      > Sorry if that wasn't clear enough.

      You actually said:

      >>>> Also, were to where or whom do I send simple typo corrections?

      I saw the word 'also' but I didn't see any word like 'other'
      between 'send' and 'simple', if you want to do pedantic; and
      the initial 'were' also threw me.



      >> You could try writing to the person who wrote the message
      >> shown in the web page I quoted above,

      > Some guy in Switzerland who has the title
      > in Mr. François VUILLEUMIER, President of
      > the Subgroup on Co-operation Central and
      > East European Countries

      > And you expect him to jump protocol and field questions from me?
      > Is this just a guess on your part or do you know him to take
      > trivial things like typos in the last draft, or clarifications
      > on the reading the actual text.

      He has an email address. Write and ask him. Don't ask, won't know.
      The web site I referred to, is one that was developing ebXML, a
      variation of XML for use in Electronic Business. François
      VUILLEUMIER is on the ISO committee responsible for ISO 8601.



      > Anyone know the USA rep, or the list of IEEE committee members?

      The ANSI or NIST Web Sites should direct you to that information.



      >> or to Louis Visser, at
      >> NNI, who was in charge of the update to the ISO 8601 standard.

      > That would be: louis.visser@...

      I first wrote to him three years ago with a bundle of comments.
      He was very helpful. He did say that my comments could not go
      directly into the ISO committee proceedings, but would be
      discussed informally elsewhere.



      > Meanwhile the document itself lists:
      > "e-mail: <jean.kubler@...>"

      Try that one as well. What is there to lose? If they can't
      help you then I would hope they would point you towards someone
      who can. Maybe one of the 100-plus people receiving this email
      has a better suggestion?



      Cheers,

      Ian.


      <mail://g1smd@...>

      <http://www.qsl.net/g1smd/>
      <http://home.freeuk.net/g1smd/>
      <http://ourworld.compuserve.com/homepages/dstrange/y2k.htm>

      <ftp://ftp.funet.fi/pub/ham/misc/g1smd.zip>
      <ftp://ftp.qsl.net/pub/g1smd/>


      [2001-07-14]

      .end
    • P A Hill & E V Goodall
      Message 2 of 15 , Jul 14, 2001
        g1smd@... wrote:
        > Section 4.6 refers. 'These leading hyphens may be omitted in
        > the applications where there is no risk of confusing these
        > representations with others'. Any leading hyphen would always
        > be to replace a missing element. Hyphens between elements are
        > separators in Extended Formats. There are no separators in the
        > Basic formats. There are never any separators before the first
        > element, only 'replacement' hyphens for missing elements.

        This is probably why the note which looks unobvious to me is
        obvious to you. I read 4.6 with the most important opening
        phrase "By mutual agreement of the partners in information interchange"
        Thus, it provides a way in specific applications of this standard to drop
        something that is stated in the standard. I don't see 4.6 as suggesting that it
        is the rationale which was used to come up with the formats which are in the standard.

        > It's all in
        > paragraph 4.6 as far as I can see.

        It says you can drop what is there, but it doesn't say that the full representation
        would treat the hundreds part of the year as a separate component from
        the tens and ones of the year. Only a few examples hint at that, but not
        all of them do and not all exceptions are noted.

        > Does this still appear
        > in the published ISO 8601:2000?

        I wouldn't know, I don't have it, I just have the various free downloads.
        Hopefully this was clear. If not it should be by now.

        > > -YY appears in truncated calendar dates, i.e. -YY-MM-DD,
        > > see examples of section 5.2.1.3.
        >
        > You have misquoted the standard.

        Sorry, my mistake. It should have read "-YY-MM".

        > In all these formats: -YYMM and -YY-MM, and -YY, the hyphen
        > does replace the missing two digits of the 'century'. In
        > YY-MM-DD it has been left out, as per para 4.6.

        Again, that is not what a pedantic read of 4.6 says. 4.6 says me and
        who ever I communicate with can cut what we see even further when we are
        only using some agreed upon subset of everything, it doesn't
        provide a rationale for what is in the standard.

        > > 5.2.2.2 (a) YY-DDD <-- no dash for missing numeric century
        >
        > Yes, because in 05-005, the '005' has to be the Day of the Year,
        > there is absolutely no other possibility. Therefore, the element
        > before that has to be a two-digit year. The leading hyphen can
        > therefore be omitted, exactly as per the YY-MM-DD example, above.
        > That is, -YY-DDD is unnecessary, YY-DDD will suffice.

        It is too bad that 4.6 doesn't actually introduce the idea that
        the writers of the standard used the idea as you
        claim, to come up with their various formats.

        Maybe, some discussion at 5.2.1 ... Year would actually set at least
        me in the right mind set.

        Also, if your suggestion as to the design is the case, I would expect a note
        like the one I was surprised to see in 5.2.2.2 after 5.2.1.3 (a)
        and the note at 5.2.3.3 noting all of the variations which are
        not "fully hyphenated". This would make them all consistent.
        The note at 5.2.3.3 would not list only
        one format, but mention all of those which one might think might have
        a leading dash for missing 'century' and another for missing year pointing
        out the simplification.

        In fact, the standard before the first truncated format in the opening paragraph
        of 5.2.1.3 says "In each case hyphens that indicate components should be used only
        as indicated or shall be omitted."

        That to me hints that some of the choices are arbitrary, so don't play around with
        them. Also, there is no place that states that all formats are mutually
        unambiguous from each other. As I was reading I was looking for just such
        a statement or examples that violated the idea. I found neither, but that
        is no proof.

        Hopefully this can all be clarified in the next edition.

        > OVER TO YOU! There are over 100 people out there reading this.
        > What say all of you? Am I right? Dan Kohn? Fred Bone? Pete
        > Forman? Aron Roberts?

        The question is not whether you are right, it is a question of
        meaning of the standard. Or another way to put it: You may be right,
        and I have no reason to think you aren't, but the standard is still
        not clear on where the "should" comes from.

        I personally am now convinced by what you have provided that
        I was misled by the facts that the very first truncated example
        YYMMDD doesn't have a leading dash, doesn't have a note which
        points this out and there is nothing up to that point which says
        what the expected style is, and there is no consistency in the following
        examples, so when I finally get to 5.2.2.2 "note: ... should be ..."
        I go back and read looking for something that tells me what should
        be anywhere and all I see are various examples without explanation
        that any are exceptions to any expectations. Thus reading the
        standard does not make me think there is any "should" involved other
        than the examples as given. That is the source of my question
        about this note.

        > You actually said:
        >
        > >>>> Also, were to where or whom do I send simple typo corrections?

        Gee, this is a useful comment, who wasn't going to discuss each
        word and comma? :-(

        > > Anyone know the USA rep, or the list of IEEE committee members?
        >
        > The ANSI or NIST Web Sites should direct you to that information.

        Yes of course it should, (not that there is an ISO standard defining that
        it should! :-) I already tried that and didn't find it.

        > Try that one as well. What is there to lose? If they can't
        > help you then I would hope they would point you towards someone
        > who can. Maybe one of the 100-plus people receiving this email
        > has a better suggestion?

        Any suggestion in my opinion is probably better than just
        pointing out that there are some e-mail addresses. I was actually
        hoping for something useful not just the obvious. I have noted
        your comments with regard to Louis Visser.

        -Paul
      • Pete Forman
        Message 3 of 15 , Jul 16, 2001
          P A Hill & E V Goodall writes:
          > [snip]
          > In fact, the standard before the first truncated format in the
          > opening paragraph of 5.2.1.3 says "In each case hyphens that
          > indicate components should be used only as indicated or shall be
          > omitted."
          >
          > That to me hints that some of the choices are arbitrary, so don't play
          > around with them. Also, there is no place that states that all
          > formats are mutually unambiguous from each other. As I was reading
          > I was looking for just such a statement or examples that violated
          > the idea. I found neither, but that is no proof.

          How about the last paragraph in 4.1.


          I agree that the choices seem arbitrary. There may be some logic
          behind it though. My guess is that the rules are something like

          Replace an omitted component in a truncated format with a hyphen.
          Component may be century (first two digits of a four digit year)
          or decade (first three digits of a four digit year) or last two
          digits of the year or month or week. The term component is not
          defined as such but the components are listed for each of the
          truncated representations.

          Remove hyphens if the result is unambiguous.
          4.6 para 1 states that a hyphen may be necessary to represent an
          omitted component. That implies to me that the hyphen should not
          be used if possible.

          (The above two rules may also be expressed as: Components may be
          omitted, if the result is ambiguous then use a hyphen to stand for
          the omitted component.)

          Add hyphens where a format is ambiguous. This is bound to be
          arbitrary: if two formats collide one must be chosen to get the
          extra hyphen.


          Note that omitting hyphens by these rules is a separate issue to
          5.2.1.3. I take the latter to mean that the communicating parties
          agree that, for example, two digits alone mean a month rather than
          using four characters of 5.2.1.3.e proper.
          --
          Pete Forman -./\.- Disclaimer: This post is originated
          WesternGeco -./\.- by myself and does not represent
          pete.forman@... -./\.- opinion of Schlumberger, Baker
          http://www.crosswinds.net/~petef -./\.- Hughes or their divisions.
        • Douglas_Luthanen@Cargill.com
          Message 4 of 15 , Jul 16, 2001
            How may I be deleted from this mailing list?
            Respectfully,

            dwl
            Doug Luthanen
            501 750-6840


            -----Original Message-----
            From: goodhill@... [mailto:goodhill@...]
            Sent: Thursday, July 12, 2001 1:50 PM
            To: iso8601@yahoogroups.com
            Subject: [ISO8601] Clarifications: 5.2.2.2


            To test this group, and my ability to post to it, I will ask a simple
            question.
            I was reading ISO/TC 154 N 362
            ISO8601:2000(E)2000-12-19

            Ordinal Date
            Section 5.2.2.2 Truncated Version

            <quote>
            Day only in the implied year
            Basic format: -DDD EXAMPLE -102
            Extended format: not applicable

            NOTE Logically, the representation should be [--DDD], but the first
            hyphen is superfluous and, therefore, it has been omitted.
            </quote>

            If I understand this correctly, they are referring to the "not
            applicable" Extended format, which the standard has left out. The
            first - would be the missing (i.e. truncated) year; the second would
            be the separator dash between the year and the day.

            Do I understand the intent of the dashes and why "the representation
            should be [--DDD]"?

            Also, were to where or whom do I send simple typo corrections?

            -Paul




          • g1smd@amsat.org
            Message 5 of 15 , Jul 16, 2001
              On 2001-Jul-14 Paul Hill <goodhill@...> wrote:


              [2001-Jul-16]



              These comments are concerning the text of ISO/TC 154 N 362 [PDF]
              document, which is the final draft version of the ISO 8601:2000
              standard. The final published version of ISO 8601:2000 is still
              not available online, but this draft (from only four days
              previously) can still be downloaded from [PDF00005.PDF]:
              <http://lists.ebxml.org/archives/ebxml-core/200104/msg00252.html>.



              >> Section 4.6 refers. 'These leading hyphens may be omitted in
              >> the applications where there is no risk of confusing these
              >> representations with others'. Any leading hyphen would always
              >> be to replace a missing element. Hyphens between elements are
              >> separators in Extended Formats. There are no separators in the
              >> Basic formats. There are never any separators before the first
              >> element, only 'replacement' hyphens for missing elements.

              > This is probably why the note which looks unobvious to me is
              > obvious to you. I read 4.6 with the most important opening
              > phrase "By mutual agreement of the partners in information
              > interchange" Thus, it provides a way in specific applications
              > of this standard to drop something that is stated in the
              > standard. I don't see 4.6 as suggesting that it is the
              > rationale which was used to come up with the formats which
              > are in the standard.

              I am very familiar with the various allowed formats, and
              variations, and being familiar with all of that, it is very
              easy for me to miss a point or hint in the wording, or to know
              something which although it isn't actually stated in the standard,
              is actually the way things are done. I do see that you have a
              point here. They state that the leading hyphens *may* be dropped,
              then go on to just automatically drop them, in some of the
              examples, without putting a note against some of them. That is
              not very good. Each and every time they do this, it does need an
              extra note or clarification. However, is para 4.9 perhaps a
              poorly worded way of trying to tell us about this?



              >> It's all in paragraph 4.6 as far as I can see.

              > It says you can drop what is there, but it doesn't say that
              > the full representation would treat the hundreds part of the
              > year as a separate component from the tens and ones of the
              > year. Only a few examples hint at that, but not all of them
              > do and not all exceptions are noted.

              I agree that this is a bit sloppy. It needs rewriting, or some
              additional notes. The minimum required, would be to state that
              the year may be specified by two, or by four, or by more
              digits; and I see a problem here... 121212 is assumed to be
              YYMMDD, but this could be the YYYYYY 121212. Having re-read the
              standard I see that para 4.7 does cover this. Additionally,
              para 4.8 does say that elements do all have a defined length,
              and that leading zeroes must be used to fulfil this. It
              doesn't fully answer your point, so I guess their wording
              needs improving.



              >> Does this still appear in the published ISO 8601:2000?

              > I wouldn't know, I don't have it, I just have the various
              > free downloads. Hopefully this was clear. If not it should
              > be by now.

              I am still waiting for someone on this email list to compile
              a note of all of the changes between the 2000-Dec-19 draft
              and the 2001-Jan-24 final published version of ISO 8601.
              Any volunteers?



              >>> -YY appears in truncated calendar dates, i.e. -YY-MM-DD,
              >>> see examples of section 5.2.1.3.

              >> You have misquoted the standard.

              > Sorry, my mistake. It should have read "-YY-MM".

              No problems. Typos happen, but it did hinder your argument a
              bit. Here, the first hyphen is replacing the 'old' 'CC' and
              the second hyphen is the separator. The Basic format of this
              is -YYMM, where the hyphen again replaces the 'old' 'CC'.
              In 'YY-MM-DD' ISO automatically dropped the leading hyphen,
              as a date like '12-12-12' cannot possibly be anything other
              than 'YY-MM-DD'. Is this hinted at in 4.9?



              >> In all these formats: -YYMM and -YY-MM, and -YY, the hyphen
              >> does replace the missing two digits of the 'century'. In
              >> YY-MM-DD it has been left out, as per para 4.6.

              > Again, that is not what a pedantic read of 4.6 says. 4.6 says
              > me and who ever I communicate with can cut what we see even
              > further when we are only using some agreed upon subset of
              > everything, it doesn't provide a rationale for what is in the
              > standard.

              As above, I do agree that the wording in 4.6 provides a reason
              for this, but does not provide a complete rationale of why this
              is done. I hope ISO clarifies it in the next edition. I do now
              see what your point is; that you *may* drop the hyphen, but
              for some reason, with YY-MM-DD and YYMMDD, ISO have *already*
              dropped it, without clearly saying why. However, is para 4.9
              perhaps a poorly worded way of trying to tell us about this?



              >>> 5.2.2.2 (a) YY-DDD <-- no dash for missing numeric century

              >> Yes, because in 05-005, the '005' has to be the Day of the
              >> Year, there is absolutely no other possibility. Therefore,
              >> the element before that has to be a two-digit year. The
              >> leading hyphen can therefore be omitted, exactly as per the
              >> YY-MM-DD example, above. That is, -YY-DDD is unnecessary,
              >> YY-DDD will suffice.

              > It is too bad that 4.6 doesn't actually introduce the idea that
              > the writers of the standard used the idea as you claim, to come
              > up with their various formats.

              Extra notes would be useful; but I also think there is a 'logic
              error' or 'precedence error' in providing some of the default
              formats. I'll explain more at the end of this message. I've
              hinted at it with the note about YYMMDD and YYYYYY above; but
              it also concerns YYYYMM.



              > Maybe, some discussion at 5.2.1 ... Year would actually set at
              > least me in the right mind set.

              > Also, if your suggestion as to the design is the case, I would
              > expect a note like the one I was surprised to see in 5.2.2.2
              > after 5.2.1.3 (a) and the note at 5.2.3.3 noting all of the
              > variations which are not "fully hyphenated". This would make
              > them all consistent.

              > The note at 5.2.3.3 would not list only one format, but mention
              > all of those which one might think might have a leading dash
              > for missing 'century' and another for missing year pointing
              > out the simplification.

              Now I see what you are saying, I agree that the wording here
              is sub-optimal. You reach a place where you see a format you
              were not expecting, with no previous rationale as to why the
              format is shown like it is. Yes, the standard is deficient
              (unless 4.9 is where it's at?) and requires extra notes.



              > In fact, the standard before the first truncated format in the
              > opening paragraph of 5.2.1.3 says "In each case hyphens that
              > indicate components should be used only as indicated or shall
              > be omitted."

              That paragraph, along with 4.6, and now seeing that several notes
              against examples are obviously missing, when all read together
              do raise some doubts. I agree that the wording is poor.



              > That to me hints that some of choices are arbitrary, so don't
              > play around with them. Also, there is no place that states
              > that all formats are mutually unambiguous from each other.

              I knew that last statement to be true (mutually unambiguous),
              but it takes a bit of tracking down to find those words in the
              very last part of 4.1 '... unique and unambiguous'; but is
              that wording as strong as '*mutually* unambiguous'? I think
              that it probably is.

              I have never considered any of the defined formats to be
              'arbitrary' choices, but now that I have condensed all of the
              standard down to a short table, further on, that table does
              appear to show that to be the case for several of the formats,
              most notably with YYMMDD.



              > As I was reading I was looking for just such a statement or
              > examples that violated the idea. I found neither, but that
              > is no proof.

              I now see what you are trying to say; and I think I have found
              one... YYMMDD vs YYYYYY (excepting the note in 4.7). Also, I
              think that YYMMDD should have been disallowed in favour of
              YYYYMM, which is not currently permitted. See the table below.



              > Hopefully this can all be clarified in the next edition.

              >> OVER TO YOU! There are over 100 people out there reading this.
              >> What say all of you? Am I right? Dan Kohn? Fred Bone? Pete
              >> Forman? Aron Roberts?

              > The question is not whether you are right, it is a question of
              > meaning of the standard. Or another way to put it: You may be
              > right, and I have no reason to think you aren't, but the
              > standard is still not clear on where the "should" comes from.

              I wasn't sure why you were hung up on this one word 'should'.
              Now that you have explained more, I am happy to agree with you.
              You are right. Although the standard works the way I have said,
              and the examples follow the method I have stated, nowhere in
              the standard does it state clearly that this is the case, or
              why it should be so, and several notes of clarification are
              obviously missing on a few examples. It takes someone else
              reading it 'fresh' to spot these errors. I am too 'familiar'
              with how it works to pick up a fundamental error like that.
              If you know how something already works, then it isn't always
              readily obvious that some little note or clarification is
              actually missing. There are 'hints' in 4.6 and 4.9, but not
              enough to satisfy, now that I have read it about 6 times.



              > I personally am now convinced by what you have provided that
              > I was misled by the facts that the very first truncated example
              > YYMMDD doesn't have a leading dash, doesn't have a note which
              > points this out and there is nothing up to that point which
              > says what the expected style is, and there is no consistency
              > in the following examples, so when I finally get to 5.2.2.2
              > "note: ... should be ..." I go back and read looking for
              > something that tells me what should be anywhere and all I see
              > are various examples without explanation that any are
              > exceptions to any expectations. Thus reading the standard
              > does not make me think there is any "should" involved other
              > than the examples as given. That is the source of my question
              > about this note.

              Fully understood. Now that you have explained it, I tend to
              agree with you. Your point was not that there was an error in
              the format they were suggesting, but that there were no
              notes to explain why it should be formatted that way, when you
              were actually expecting to see something else there, after
              following the 'logic' of the previous few paragraphs. Para
              4.9 may be referring to this, but it isn't obvious or clearly worded.
              I keep saying *may* because I don't really know if it is, or
              if it isn't.



              I have checked the next part of this message very thoroughly
              but due to the huge complexity in compiling it, I cannot
              guarantee that I caught all of the initial typing errors.



              In the next part, I have used a Year of 1212, Month of 12,
              and Day of 12, so that you are not influenced by digits like
              '99' seemingly assuring you these are the last two digits of
              a year... when in fact two digits stated on their own are
              actually for the FIRST two digits of the year (unless by
              'mutual agreement', etc).



              This next part needs to be viewed using a NON-proportional
              typeface, so that it aligns in columns. Copy and paste to a
              Word Processor, if necessary, in order to achieve this.



              There is an inconsistency in the standard, which becomes
              obvious when I list the allowed and non-allowed formats in
              a table, like this (non-allowed are marked with x here;
              'allowed only by mutual agreement' are marked with z; and
              the '!' marking means 'see notes'; read the left and right
              half separately):


                     YYYY-MM-DD  YYYY-MM-DD       YYYYMMDD  YYYYMMDD
                     ----------  ----------       --------  --------

              a      YY          12               YY        12        a
              b      YYYY        1212             YYYY      1212      b
              c      YYYY-MM     1212-12      x!  YYYYMM    121212    c
              d      YYYY-MM-DD  1212-12-12       YYYYMMDD  12121212  d
              e      -YY         -12              -YY       -12       e
              f  z   YY          12           z   YY        12        f
              g      -YY-MM      -12-12           -YYMM     -1212     g
              h  z   YY-MM       12-12        z   YYMM      1212      h
              i  x!  -YY-MM-DD   -12-12-12    x!  -YYMMDD   -121212   i
              j  !   YY-MM-DD    12-12-12     !   YYMMDD    121212    j
              k      --MM        --12             --MM      --12      k
              l  x   -MM         -12          x   -MM       -12       l
              m  z   MM          12           z   MM        12        m
              n      --MM-DD     --12-12          --MMDD    --1212    n
              o  x   -MM-DD      -12-12       x   -MMDD     -1212     o
              p  z   MM-DD       12-12        z   MMDD      1212      p
              q      ---DD       ---12            ---DD     ---12     q
              r  x   --DD        --12         x   --DD      --12      r
              s  x   -DD         -12          x   -DD       -12       s
              t  z   DD          12           z   DD        12        t

              An additional note must also state that leading hyphens only
              replace elements, and are never separators, for this to work.
              Also note that one digit elements, and three, five and seven
              digit formats are not allowed (unless, I suppose 'by mutual
              agreement...'). In any case, a one digit element would always
              have to be the leftmost element in the expression (e.g.
              YYYMM, YMM, YMMDD) except for showing a decade ('197' for
              the 1970s?), but these are 'horrible' structures, the latter
              especially possibly being almost outside the scope of the
              ISO 8601 standard.


              Notes and Explanations:
              -----------------------

              LEFT:                               RIGHT:

              Not allowed (x left), because       c - In my opinion, this format
              (except z by mutual agreement):         should be allowed, not j !
              f - used by a, confuse with m t     f - used by a, confuse with m t
              h - could be confused with p        h - used by b, confuse with p
              i - logical, but ISO use j          i - logical, but ISO use j
              j - allowed!! In my opinion,        j - allowed!! In my opinion
                  it should not be allowed.           it should not be allowed.
              l - used by e, confuse with s       l - used by e, confuse with s
              m - used by a, confuse with f t     m - used by a, confuse with f t
              o - format used by g                o - format used by g
              p - could be confused with h        p - used by b, confuse with h
              r - format used by k                r - format used by k
              s - used by e, confuse with l       s - used by e, confuse with l
              t - used by a, confuse with f m     t - used by a, confuse with f m


              I say 'allowed!!' against the LEFT 'j' format only because
              it is the only 'right-justified' Extended date format that
              is allowed (without a leading hyphen to show the missing
              elements), and it is out of place. On the RIGHT side,
              YYMMDD (j), which is also allowed, will therefore conflict
              with YYYYYY. In my opinion, usage of YYMMDD in the standard
              is NOT correct. They should have allowed the YYYYMM format
              instead. It is strange that the YYYYMM (RIGHT c) date format
              is not allowed, as it totally breaks the logic of the table,
              if you are looking for a pattern. The pattern to me is that
              dates are 'left justified', with hyphens in place of missing
              left elements, one hyphen per two digits omitted (which is a
              definition that neatly avoids having to use a word like
              'century'; as long as it is also stated that the year can be
              two or four digits; or more digits, just as long as only two
              at a time are added), and that reduced precision simply
              deletes digits two at a time from the right of the date.
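
              As a rough illustration of that pattern (a sketch only; the
              function name is made up for this example and nothing like it
              appears in the standard), the ten 'logical' Basic forms can be
              generated by treating the date as four digit pairs and keeping
              any unbroken run of them, with one leading hyphen for each
              pair dropped from the left:

```python
def logical_basic_forms(date8):
    """List the 'left justified' Basic forms of an 8-digit YYYYMMDD string.

    Hypothetical helper: it follows the pattern described in this message,
    not the set of formats that ISO 8601:2000 actually permits.
    """
    if len(date8) != 8 or not (date8.isascii() and date8.isdigit()):
        raise ValueError("expected exactly eight digits, e.g. '20010715'")
    # Split into four two-digit units: YY, YY, MM, DD.
    pairs = [date8[i:i + 2] for i in range(0, 8, 2)]
    forms = []
    for start in range(4):                # pairs omitted on the left
        for end in range(start + 1, 5):   # pairs kept, up to this index
            # One hyphen replaces each omitted left-hand pair.
            forms.append("-" * start + "".join(pairs[start:end]))
    return forms


if __name__ == "__main__":
    for form in logical_basic_forms("12121212"):
        print(form)
```

              For 12121212 the list contains -121212 for the two-digit-year
              date and a bare 121212 only once, as YYYYMM, which is the
              reading argued for above.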

              Now, looking at the standard condensed to this table, it is
              obvious that the problems also occur when there is a format
              that is allowed in the left column, but disallowed in the
              right column (e.g. c), as the standard does not provide
              enough information to support why this is done. To me that
              is an error.

              Also, the 'mutual agreement' problem appears here again.
              Representations that have the prescribed leading hyphens
              omitted can be used only by mutual agreement... except
              that the format at j seems to be the default, rather than
              the format stated in i. That is, for all the others, mutual
              agreement is required, but for j it has already been
              forced upon us to agree to this. In doing this, you get the
              'logic error' with the RIGHT c entry being disallowed (in
              order to satisfy the {non written, as far as I can see} rule
              that any representation can have only one implied meaning
              unless mutual agreement has already been obtained).

              I guess they had to include YYMMDD and exclude YYYYMM simply
              because millions of computer systems were already using YYMMDD.
              However that probably helped people to avoid thinking about
              Y2K problems for far longer than they should have done.
              1988 would actually have been early enough for every version
              of Windows (3.x onwards) to be completely free of all such
              problems, for example.



              This next part needs to be viewed using a NON-proportional
              typeface, so that it aligns in columns.



              The table can be rearranged to ask what a numerical format
              should be decoded as. To keep it simple, I have not divided
              it into Basic and Extended formats. Anything with a hyphen
              between elements is an Extended format. Writing the table
              this way, I have included some formats that the ISO standard
              says are 'Not Applicable'. There cannot be a way to tell if
              '1950' is supposed to be a Basic format Year or an Extended
              format Year. I have ignored this and included it under both
              styles. The table produces (again, a, b, etc, refer to notes
              after) the following result:


                             ALLOWED            DISALLOWED
                             -------            ----------

              1212-12-12     YYYY-MM-DD
              1212-12        YYYY-MM
              1212           YYYY               z   YYMM  MMDD
              12             YY (19 of 1950)    z   YY (50 of 1950)  MM  DD

              12121212       YYYYMMDD
              121212      a  YYMMDD !           x!  YYYYMM !
              1212           YYYY               z   YYMM  MMDD
              12             YY (19 of 1950)    z   YY (50 of 1950)  MM  DD

              1212-12-12     YYYY-MM-DD
              12-12-12    b  YY-MM-DD !
              12-12       c  n/a                z   YY-MM  MM-DD !
              12             YY (19 of 1950)    z   YY (50 of 1950)  MM  DD

              -12-12-12   d  n/a !              x!  -YY-MM-DD (use YY-MM-DD) !
              -12-12         -YY-MM             x   -MM-DD
              -12            -YY (50 of 1950)   x   -MM  -DD

              -121212     e  n/a !              x!  -YYMMDD ! (use YYMMDD) !
              -1212          -YYMM              x   -MMDD
              -12            -YY (50 of 1950)   x   -MM  -DD

              --12-12        --MM-DD
              --12           --MM               x   --DD

              --1212         --MMDD
              --12           --MM               x   --DD

              ---12          ---DD

              12-1212        n/a                x   Not allowed at all.


              Notes and Explanations:
              -----------------------

              At a, I wish that 121212 were really YYYYMM, not YYMMDD.

              People use b, but logically the full date is -12-12-12.
              It would be useful if both b and YYMMDD were disallowed, or
              at least reverted to the 'use by mutual agreement' status.

              At c, I am glad that both are not valid, but logically since
              YY-MM-DD was allowed at b, then 12-12 would have to be MM-DD
              (both would then be 'right justified'). However these would be
              the only two 'right justified' dates in the table. Everything
              else is 'left justified, with a hyphen to replace each missing
              pair of digits'. So really it is b that breaks this unwritten
              rule. I wonder if ISO realise what they have done?

              At d and e, these are the formats you would logically expect
              to see being used, but ISO just automatically dropped the
              leading hyphen, producing the formats at a and b instead.
              By breaking the 'pattern' you are correct to say that some
              choices of format now appear to be arbitrary.

              And, z, just confirms the old 'by mutual agreement you can
              drop leading hyphens for any of these date formats' rule,
              and use all the meanings that I have currently placed in the
              'disallowed' column if you need to do so.

              If a line has an 'x', then entries in the 'disallowed' column
              to the right of the 'x' are never allowed to be used.

              For anything marked 'n/a !' in the allowed column, the '!'
              points to surprise at the blank; that ISO have disallowed
              what you would logically expect to be there.

              A date format in the allowed column, with a '!' included, you
              would logically expect to be disallowed, but ISO included it.

              Having allowed YYMMDD in, you would expect YY on its own to
              simply be the 50 of 1950, but it actually represents the 19
              part.

              Logically, both of the entries at ALLOWED 'a' and ALLOWED 'b'
              should really be 'by mutual agreement' formats, but ISO chose
              not to do this.

              From this, it is now obvious that ISO have 'jumped the gun'
              by automatically deleting leading hyphens on some formats
              that would be logically expected to have one or more leading
              hyphens, and have not provided a note to say that this is
              what has happened (unless 4.6 and 4.9 are the hint).
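
              To make the ALLOWED column of the decoding table above
              concrete, here is a small sketch that maps a string to the
              template it would be read as. The template list is copied from
              my table, not from the standard itself; the function names are
              invented for the example, and only the shape (digits and
              hyphens) is checked, not whether the month or day is in range.

```python
import re

# Templates from the ALLOWED column of the table in this message.
ALLOWED_TEMPLATES = [
    "YYYY-MM-DD", "YYYY-MM", "YYYY", "YY",
    "YYYYMMDD", "YYMMDD", "YY-MM-DD",
    "-YY-MM", "-YYMM", "-YY",
    "--MM-DD", "--MMDD", "--MM",
    "---DD",
]


def template_to_regex(template):
    # Each Y, M or D stands for exactly one digit; hyphens are literal.
    parts = ("[0-9]" if ch in "YMD" else re.escape(ch) for ch in template)
    return re.compile("".join(parts))


PATTERNS = [(template_to_regex(t), t) for t in ALLOWED_TEMPLATES]


def decode(text):
    """Return the matching template, or None if the shape is not allowed."""
    for pattern, template in PATTERNS:
        if pattern.fullmatch(text):
            return template
    return None


if __name__ == "__main__":
    for sample in ["121212", "12-12-12", "-121212", "12-12", "12-1212"]:
        print(sample, "->", decode(sample))
```

              Here 121212 decodes as YYMMDD and 12-12-12 as YY-MM-DD, while
              -121212, 12-12 and 12-1212 come back as None, matching notes
              a to e.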



              Another part of the problem is that the date is written as
              4-2-2. If there were a separator between the 'Century' and
              the two digit Year, then semantic rules would be a lot easier.

              Today would be 20-01-07-15. So, leaving off left elements
              would give -01-07-15, --07-15, ---15, and leaving off right
              elements would give 20-01-07, 20-01, and 20. It's the fact
              that the Year is four digits, and the other elements are
              only two digits that skews the pattern; as well as the
              inclusion of YYMMDD in the default standard, rather than
              YYYYMM. For Basic format dates read 20010715, -010715,
              --0715, ---15, 200107, 2001, and 20 noting that here 200107
              is YYYYMM, because YYMMDD is actually done by using -YYMMDD.
              That IS consistent. ISO Ver 3?
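
              That 2-2-2-2 idea is easy to try out. The sketch below is
              illustrative only (the function name and the separate
              'century' argument are assumptions made for the example, not
              anything in ISO 8601). It drops pairs from the left, replacing
              each with a hyphen, and then drops pairs from the right:

```python
def truncations(century, year, month, day, sep="-"):
    """Return the full 2-2-2-2 date followed by its truncated forms.

    Hypothetical scheme from this message: every element is two digits,
    and each element omitted on the left is replaced by one hyphen.
    """
    parts = [f"{value:02d}" for value in (century, year, month, day)]
    forms = [sep.join(parts)]
    # Leave off left-hand elements, one hyphen per omitted element.
    for omitted in range(1, 4):
        forms.append("-" * omitted + sep.join(parts[omitted:]))
    # Leave off right-hand elements (reduced precision), no marker needed.
    for kept in range(3, 0, -1):
        forms.append(sep.join(parts[:kept]))
    return forms


if __name__ == "__main__":
    print(truncations(20, 1, 7, 15))          # separated forms
    print(truncations(20, 1, 7, 15, sep=""))  # Basic forms
```

              With the values 20, 1, 7, 15 the default separator gives the
              hyphenated list above, and sep="" gives the Basic list, in
              which 200107 can only be YYYYMM.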



              Another way of solving the problem would be to set up a
              standard where the date is a full 8 digits, and has optional
              separators, but if any pair of digits on either the left or
              right end of the date are missing then they are replaced,
              each pair of missing digits, with a hyphen (whereas ISO 8601
              only ever places hyphens on the left side of a date), as
              well as there being a rule to say that there are no leading
              or trailing separators, only hyphens used as replacements.
              In other words, separators are only ever used in order to
              separate digits.


              This would produce something like:


              BASIC FORMAT          EXTENDED FORMAT
              ------------          ---------------

              12121212  YYYYMMDD    1212-12-12  YYYY-MM-DD
              121212-   YYYYMM-     1212-12-    YYYY-MM-
              1212--    YYYY--      1212--      YYYY--
              12---     YY---       12---       YY---
              -121212   -YYMMDD     -12-12-12   -YY-MM-DD
              -1212-    -YYMM-      -12-12-     -YY-MM-
              -12--     -YY--       -12--       -YY--
              --1212    --MMDD      --12-12     --MM-DD
              --12-     --MM-       --12-       --MM-
              ---12     ---DD       ---12       ---DD


              You can always check the date is correctly formed, by using
              a very simple set of rules... Ignore all hyphens BETWEEN
              digits. Group all digits into pairs. Count each leading
              hyphen, each digit pair, and each trailing hyphen. There
              should always be exactly FOUR units.

              A further rule would state that in order to join the date
              to a time, the date should NOT have trailing hyphens, whilst
              still having four units, i.e. the right hand end MUST be
              the Day Number digits.

              This only works for a four digit year system. For all years
              beyond 9999, the standard has to be rewritten to add formats
              with the Year stated as six digits, an extra leading hyphen
              to be added to every one of the above formats, and with the
              format checking rule updated to say that there are now FIVE
              units, rather than only four.
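
              The checking rules above are simple enough to write down
              directly. This is only a sketch of the scheme proposed in this
              message (it is not an ISO 8601 validator, and the function
              names and the expected_units parameter are invented here). It
              follows the rule literally, so it does not check where the
              separators between digits have been placed:

```python
def count_units(text):
    """Count leading hyphens + digit pairs + trailing hyphens.

    Returns None if the digits cannot be grouped into pairs.
    """
    leading = len(text) - len(text.lstrip("-"))
    trailing = len(text) - len(text.rstrip("-"))
    core = text.strip("-")
    # Hyphens BETWEEN digits are separators and are ignored.
    digits = core.replace("-", "")
    if not digits or not (digits.isascii() and digits.isdigit()):
        return None
    if len(digits) % 2:
        return None
    return leading + len(digits) // 2 + trailing


def is_well_formed(text, expected_units=4):
    # Four units for a four-digit year; five for a six-digit year.
    return count_units(text) == expected_units


def can_join_time(text, expected_units=4):
    # A date joined to a time must end with the Day Number digits.
    return is_well_formed(text, expected_units) and not text.endswith("-")


if __name__ == "__main__":
    for sample in ["12121212", "1212-12-", "-12--", "---12", "1212-12"]:
        print(sample, is_well_formed(sample), can_join_time(sample))
```

              Every entry in the table above counts to four units, whereas
              ISO's own 1212-12 counts to only three, because under this
              scheme the missing day would need a trailing hyphen.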

              I don't propose this as a replacement to ISO 8601, but just
              mention it as all these points apply to the current standard
              in some way.



              The more I look at things, the more I am convinced that
              including YYMMDD rather than YYYYMM is the root cause of
              most of the problem. This looks like being a Y2K problem
              in another guise. The original standard was written well
              before the Year 2000. Including YYMMDD simply pandered
              to the, then current, vogue for a 'default' of a two-digit
              year, without thought for the overall logic of the whole
              document. The new standard makes reference to using all
              four digits for the Year to avoid these sorts of problems,
              but this could also have helped in revising the logic of
              the standard... by suggesting that users go back to using
              -YYMMDD in preference to just YYMMDD alone, or, even
              better still, just adopt the full four digit year. This
              would then allow 121212 to be 'YYYYMM' as originally, and
              logically, expected.



              Now, let's re-do those first two tables above for the Ordinal
              Day of Year Format. Note that I have repeated formats like
              YY and YYYY from the first table for completeness; you can't
              tell if YYYY is part of a 'normal' Gregorian Calendar date,
              an Ordinal Date, or a 'Week Number and Day of Week' date, so
              I repeat them here, whereas the official standard does not:


                     YYYY-DDD  YYYY-DDD       YYYYDDD   YYYYDDD
                     --------  --------       -------   -------

              a      YY        12             YY        12       a
              b      YYYY      1212           YYYY      1212     b
              c      YYYY-DDD  1212-121       YYYYDDD   1212121  c
              d      -YY       -12            -YY       -12      d
              e  z   YY        12         z   YY        12       e
              f  x!  -YY-DDD   -12-121    x!  -YYDDD    -12121   f
              g  !   YY-DDD    12-121     !   YYDDD     12121    g
              h  x!  --DDD     --121      x!  --DDD     --121    h
              i  !   -DDD      -121       !   -DDD      -121     i
              j  z   DDD       121        z   DDD       121      j
              k  x   (-)(-)DD  12         x   (-)(-)DD  12       k
              l  x   (-)(-)D   1          x   (-)(-)D   1        l

              An additional note must also state that leading hyphens only
              replace elements, and are never separators, for this to work.
              Again 'z' refers to 'mutual agreement' formats, and 'x' means
              not permitted.


              Notes and Explanations:
              -----------------------

              LEFT:                               RIGHT:

              c - complete                        c - 7 digits: unambiguous.
              e - used by a                       e - used by a
              f - is the logical default, but     f - is the logical default, but
              g - does not conflict with          g - does not conflict with
                  anything else                       anything else (5 digits).
              h - logical default, but            h - logical default, but
              i - ISO dropped one leading '-'     i - ISO dropped one leading '-'
              j - mutual agreement, but three     j - mutual agreement, but three
                  digits are unambiguous anyway       digits are unambiguous anyway
              k - not allowed, must be 3 digits   k - not allowed, must be 3 digits
              l - not allowed, must be 3 digits   l - not allowed, must be 3 digits

              The Year can be two or four digits (with a leading hyphen
              mandatory for some two digit formats), the Day of Year must
              be three digits.

              You would expect 'g' to be by 'mutual agreement' and for
              'i' to be 'not permitted' if you compare this table with
              the one for the Gregorian Calendar Date, but 4.6 and 4.9
              override this. Entry 'j' is still 'by mutual agreement'
              as would be expected.



              The table can be rearranged to ask what a numerical format
              should be decoded as. To keep it simple, I have not divided
              it into Basic and Extended formats. Anything with a hyphen
              between elements is an Extended format. Writing the table
              in this way, I have included some formats that the ISO
              standard says are 'Not Applicable'. There cannot be a way
              to tell if '1950' is supposed to be a Basic format Year or
              an Extended format Year. I have ignored this and included
              it under both styles. The table produces (again, a, b, etc,
              refer to notes after) the following result:


                             ALLOWED            DISALLOWED
                             -------            ----------

              1212-121       YYYY-DDD
              1212           YYYY               x   YDDD
              12             YY (19 of 1950)    z   YY (50 of 1950)

              1212121        YYYYDDD
              1212           YYYY               x   YDDD
              12             YY (19 of 1950)    z   YY (50 of 1950)

              1212-121       YYYY-DDD
              12-121      a  YY-DDD !
              121            n/a !              z   DDD  x YYY

              1212121        YYYYDDD
              12121       b  YYDDD !
              121            n/a !              z   DDD  x YYY

              -12-121     c  n/a !              x!  -YY-DDD (use YY-DDD) !
              -121        e  -DDD !             x   -YYY (-YY or YYYY)
              -12            -YY (50 of 1950)   x   -DD (must be DDD)

              -12121      d  n/a !              x!  -YYDDD ! (use YYDDD) !
              -121        e  -DDD !             x   -YYY (-YY or YYYY)
              -12            -YY (50 of 1950)   x   -DD (must be DDD)

              --121       f  n/a !              x!  --DDD ! (use -DDD) !
              --12        g  n/a                x   --DD (must be DDD)


              Notes and Explanations:
              -----------------------

              At a, ISO automatically dropped the hyphen of -YY-DDD to
              make YY-DDD, just as they did at Gregorian Date 'a'.

              At b, ISO automatically dropped the hyphen of -YYDDD to
              make YYDDD, just as they did at Gregorian Date 'b'.

              For c and d, the disallowed format is the one that you would
              be logically expecting to be allowed, but see notes a and b.

              For e, you would logically expect --DDD, but with no risk
              of misinterpretation ISO automatically dropped the first
              hyphen, leaving -DDD. You could drop all hyphens, because
              this is the only three digit element in the standard
              (unless by mutual agreement you are exchanging YYY, or you
              have dropped the leading W from W527), but ISO insist here
              on having a minimum of one leading hyphen in place of all
              of the missing elements.

              For f, the disallowed format is the one that you would
              be logically expecting to be allowed, but see note e.

              At g, --12 always means --MM for Gregorian Calendar Date;
              two digit day is not allowed here.

              And, z, just confirms the old 'by mutual agreement you can
              drop leading hyphens for any of these date formats' rule,
              and use all the meanings that I have currently placed in the
              'disallowed' column if you need to do so.

              If a line has an 'x', then entries to the right of the 'x'
              in the 'disallowed' column are never allowed to be used.

              For anything marked 'n/a !' in the allowed column, the '!'
              points to surprise at the blank; that ISO have disallowed
              what you would logically expect to be there.

              A date format in the allowed column, with a '!' included, you
              would logically expect to be disallowed, but ISO included it.

              Having allowed YYMMDD in, you would expect YY on its own to
              simply be the 50 of 1950, but it actually represents the 19
              part.

              Of course, whilst formats here for Ordinal Day of Year should
              not conflict with each other, they also must not conflict with
              any format already being used for Gregorian Calendar Date, and
              vice versa. Ditto for the 'Week Number and Day of Week' format.



              There is no need to do the tables for the Week format, because
              the letter 'W' built in, before the week number, will always
              show what is going on, and the single digit day (even if it is
              on its own) has to be the Day of Week, because a single digit
              is not allowed to be used for any other Date (or Time)
              representation (except by 'mutual agreement'...).

              However, looking closely at 5.2.3.3 (c) and (d), reveals
              formats with a SINGLE digit YEAR. This is something not
              mentioned at all for Gregorian or Ordinal date. Why should
              the Week/Day format be any different? There is absolutely
              no explanation here, other than to say that I believe that
              the 1988 version of the standard used to say that formats
              were not limited to the examples listed in the standard,
              just as long as all formats followed the general rules
              about element ordering, use of separators, and consistent
              element length achieved by use of leading zeroes where
              required, etc.

              So, I guess that formats like YYY and Y-DDD and YYY-DDD
              are allowed 'by mutual agreement' but their Basic Formats
              of YYY and YDDD and YYYDDD have a real risk of being
              interpreted, wrongly, as DDD and YYYY and YYMMDD (or
              YYYYMM) respectively.
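
              To make those collisions concrete, here is a minimal Python
              sketch. It assumes that a Basic Format is obtained simply by
              deleting the hyphens from a format template; the helper name
              basic_form and the list of pairs are illustrative only and do
              not come from the standard.

```python
def basic_form(template: str) -> str:
    """Drop hyphen separators from a format template, e.g. 'Y-DDD' -> 'YDDD'."""
    return template.replace("-", "")


# (format usable by mutual agreement, pre-defined format it may be mistaken for)
PAIRS = [
    ("YYY", "DDD"),
    ("Y-DDD", "YYYY"),
    ("YYY-DDD", "YYMMDD"),
    ("YYY-DDD", "YYYYMM"),
]


def collides(agreed: str, predefined: str) -> bool:
    """True when both templates have the same number of digits once hyphens go."""
    return len(basic_form(agreed)) == len(basic_form(predefined))


for agreed, predefined in PAIRS:
    print(basic_form(agreed), "vs", predefined, "->", collides(agreed, predefined))
```

              With the hyphens gone, only the digit count is left to tell the
              formats apart, and in each pair that count is the same.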



              Para 4.5, Note 1 says 'The hyphen may also be used to
              indicate omitted components', but it doesn't say whether
              these are components omitted on the left, right, or in the
              middle of a date. It isn't permitted to omit components
              from the middle of a date, but the note does give the
              initial impression that 1999-- is just as valid a date
              format as --1231 is.



              All of these complications with ISO 8601 are why, when most
              people implement the standard, they actually just list the
              formats that they are going to allow, these usually being
              just a small subset of what is actually possible. I *always*
              use a four digit year, for example.



              A BIG MISTAKE. This paragraph, and the next three, have been
              inserted after I had written the whole of this message, and
              was just about to hit the 'Send' button.

              Basic Formats do not include separators. So, how come para
              5.2.1.2 (a) does include a hyphen separator? The YYYY-MM
              format is a *Basic* Format according to the Standard. How
              can that be right? Extended Formats include separators.
              Basic Formats do not include separators, except YYYY-MM,
              we are supposed to believe.

              Stand up ISO. You have been caught out. The choice of defined
              formats appears to be arbitrary. I expected to see, and have
              always understood it to be so, that the choice of formats
              was based on a logical pattern. This does not appear to be
              the case. I had already spotted a couple of dodges (regarding
              YYMMDD, YYDDD, and I thought YYYYMM), but this one point,
              'YYYY-MM', has eluded me for a very long time (circa 7
              years!). I have not adjusted my tables to reflect this
              last point, so you can work out for yourself the effect
              that the ISO kludge has on the logic. To be fair, I don't
              think ISO have ever claimed that the standard was based
              on a logical pattern, other than the highest element first
              'Year-Month-Day-Hour-Minute-Second' element ordering, and
              unambiguous representation of dates and times and zones.

              This definition for YYYY-MM carries over from the earlier
              1988 standard which can be downloaded from my FTP site at:
              <ftp://ftp.qsl.net/pub/g1smd/> if you haven't already got
              it. It means that in my first table, everything on the LEFT
              is an Extended Format, except that the entry at LEFT 'c'
              is to be included on the RIGHT and be called a Basic Format.
              That just does not make any sense at all. And, it's all
              because they wanted to define 121212 as YYMMDD, rather than
              accepting YYYYMM. The 'similar' YYYY-DDD (Ordinal Date)
              format is correctly listed as an Extended Format, so this
              listing of YYYY-MM as a Basic Format just does not make
              any sense at all. I've been duped!



              This is the most complicated thing I have written for a few
              weeks. I really hope I have caught all of the typos!



              Cheers,

              Ian.


              <mail://g1smd@...>

              <http://www.qsl.net/g1smd/>
              <http://home.freeuk.net/g1smd/>
              <http://ourworld.compuserve.com/homepages/dstrange/y2k.htm>

              <ftp://ftp.funet.fi/pub/ham/misc/g1smd.zip>
              <ftp://ftp.qsl.net/pub/g1smd/>


              [2001-07-16]

              .end
            • Pete Forman
              Message 6 of 15 , Jul 17, 2001
                g1smd@... writes:
                > [most of the material snipped]

                > I agree that this is a bit sloppy. It needs rewriting, or some
                > additional notes. The minimum required, would be to state that the
                > year may be specified by two, or by four, or by more digits; and I
                > see a problem here... 121212 is assumed to be YYMMDD, but this
                > could be the YYYYYY 121212.

                No, were an expanded representation to be used for a six digit year it
                would be +121212.


                > Also, the 'mutual agreement' problem appears here again.
                > Representations that have the prescribed leading hyphens omitted
                > can be used only by mutual agreement... except that the format at j
                > seems to be the default, rather than the format stated in i. That
                > is, for all the others, mutual agreement is required, but for j it
                > has already been forced upon us to agree to this. In doing this,
                > you get the 'logic error' with the RIGHT c entry being disallowed
                > (in order to satisfy the {non written, as far as I can see} rule
                > that any representation can have only one implied meaning unless
                > mutual agreement has already been obtained).

                As I said in my previous message I consider that there are two
                possible reasons for omitting hyphens. Mutual agreement is one. The
                other is that hyphens should/must only be used to stand for omitted
                components in order to disambiguate.


                > I guess they had to include YYMMDD and exclude YYYYMM simply
                > because millions of computer systems were already using YYMMDD.

                Not necessarily. We are talking about a date format, it is reasonable
                to give preference to the full precision interpretation over the
                reduced one.

                In general ISO 8601 does not say that two digit years are evil. It
                passes them off as a specific case of a truncated representation.

                > The table can be rearranged to ask what a numerical format should
                > be decoded as. To keep it simple, I have not divided it into Basic
                > and Extended formats. Anything with a hyphen between elements is an
                > Extended format.

                As you probably realise, that contradicts 5.2.1.2.

                > Writing the table this way, I have included some formats that the
                > ISO standard says are 'Not Applicable'. There cannot be a way to
                > tell if '1950' is supposed to be a Basic format Year or an Extended
                > format Year. I have ignored this and included it under both styles.

                Again, the standard is clear that '1950' is basic format only. I take
                'Not Applicable' to mean "don't use this".




                What we could do with is a rationale for the standard. I wonder if
                one was produced.

                The other useful production would be a general reader. Given any
                input string it should be possible to determine whether it is basic or
                extended, full or reduced precision, expanded or not, truncated or
                not, calendar or ordinal or week. (At a higher level we need to
                determine whether a string is a date, time, interval, etc.)

                A start for this might be

                Parse as date:
                  Does it contain a 'W'?
                    => parse as week date
                  else Does it have an even number of digits? **
                    => parse as calendar date
                  else
                    => parse as ordinal date

                Parse as calendar date:
                  Split into fields of hyphen or pair-of-digits or plus (1st only)
                  Match against candidate formats

                Parse as ordinal date:
                  Split into fields of hyphen or pair-of-digits or plus (1st only)
                    or triple-of-digits (last only)
                  Match against candidate formats

                **This assumes that expanded formats use an even number of digits for
                the year. A different approach might tolerate an odd number.
                Actually, in order to parse expanded formats the number of digits
                for the year must be known, otherwise there is no way to distinguish
                days from years from centuries. Years before 0000 are problematic
                as well. But according to 4.3.2.1 mutual agreement is needed for
                years prior to 1582 anyway.
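
                The top-level decision list above can be sketched in Python as
                follows. This is only an illustration under the assumption
                stated in the ** note (an even number of year digits); the
                function name classify_date is hypothetical and nothing here
                checks that the string is otherwise well formed.

```python
def classify_date(text: str) -> str:
    """Return 'week', 'calendar' or 'ordinal' using the decision list above."""
    if "W" in text:
        return "week"
    digit_count = sum(ch.isdigit() for ch in text)
    # Assumption from the ** note: years always contribute an even number
    # of digits, so an odd total can only come from a three-digit day of year.
    return "calendar" if digit_count % 2 == 0 else "ordinal"


for sample in ("1985W155", "19850412", "1985-04-12", "1985102", "-102"):
    print(sample, "->", classify_date(sample))
```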

                The table for calendar dates then starts

                Number of  Fields       Format      Section      Note
                fields
                1          +            illegal
                1          -            illegal
                1          2            YY          5.2.1.2.c.B
                2          + -          illegal
                2          + 2          +YY         5.2.1.4.d.B  1
                2          - -          illegal
                2          - 2          -YY         5.2.1.3.c.B  2
                2          2 -          illegal
                2          2 2          YYYY        5.2.1.2.b.B
                ...
                4          2 2 2 2      YYYYMMDD    5.2.1.1.B
                ...
                6          2 2 - 2 - 2  YYYY-MM-DD  5.2.1.1.E

                Notes:
                1 Implicitly assume that expanded representation years have 4 digits
                2 Implicitly assume that expanded representation years are positive
                or have more than 4 digits
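
                A rough Python sketch of the "split into fields, then match"
                step for calendar dates follows. It contains only the rows
                quoted in the table above; the elided rows are deliberately
                not filled in, so anything else comes back as None. The names
                CALENDAR_TABLE, split_fields and lookup_calendar are
                hypothetical.

```python
import re

# Only the rows quoted in the table above; the "..." rows are not guessed at.
CALENDAR_TABLE = {
    ("+",): "illegal",
    ("-",): "illegal",
    ("2",): "YY",
    ("+", "-"): "illegal",
    ("+", "2"): "+YY",
    ("-", "-"): "illegal",
    ("-", "2"): "-YY",
    ("2", "-"): "illegal",
    ("2", "2"): "YYYY",
    ("2", "2", "2", "2"): "YYYYMMDD",
    ("2", "2", "-", "2", "-", "2"): "YYYY-MM-DD",
}


def split_fields(text):
    """Split into '+', '-' or '2' (a pair of digits); None if text does not fit."""
    # A plus is allowed in first position only; everything else must be
    # hyphens or complete pairs of digits.
    if not re.fullmatch(r"\+?(?:-|\d\d)*", text):
        return None
    tokens = re.findall(r"\+|-|\d\d", text)
    return tuple("2" if tok.isdigit() else tok for tok in tokens)


def lookup_calendar(text):
    """Return the table entry for text, or None when no quoted row applies."""
    fields = split_fields(text)
    return None if fields is None else CALENDAR_TABLE.get(fields)


for sample in ("19", "-85", "1985", "19850412", "1985-04-12", "123"):
    print(sample, "->", lookup_calendar(sample))
```

                Keying the table on the tuple of field kinds keeps the lookup
                independent of the actual digits, which matches the way the
                table above is laid out.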


                Different versions of tables would be needed for different expanded
                representations. Expanded and truncated representations are mutually
                exclusive. The agreement between parties has to be inspected to
                establish whether a leading hyphen means a negative year or truncated
                representation.
                --
                Pete Forman -./\.- Disclaimer: This post is originated
                WesternGeco -./\.- by myself and does not represent
                pete.forman@... -./\.- opinion of Schlumberger, Baker
                http://www.crosswinds.net/~petef -./\.- Hughes or their divisions.
              • P A Hill & E V Goodall
                Message 7 of 15 , Jul 17, 2001
                  Pete Forman wrote:
                  >
                  > P A Hill & E V Goodall writes:
                  > > [snip]
                  > > In fact, the standard before the first truncated format in the
                  > > opening paragraph of 5.2.1.3 says "In each case hyphens that
                  > > indicate components should be used only as indicated or shall be
                  > > omitted."
                  > >
                  > > That to me hints that some of choices are arbitrary, so don't play
                  > > around with them. Also, there is no place that states that all
                  > > formats are mutually unambiguous from each other.
                  >
                  > How about the last paragraph in 4.1.

                  Very good! That does qualify as stating that are all tying to be unique.

                  > Add hyphens where a format is ambiguous. This is bound to be
                  > arbitrary: if two formats collide one must be chosen to get the
                  > extra hyphen.

                  >
                  > Note that omitting hyphens by these rules is a separate issue to
                  > 5.2.1.3. I take the latter to mean that the communicating parties
                  > agree that, for example, two digits alone mean a month rather than
                  > using four characters of 5.2.1.3.e proper.

                  Yes that is how I read all of the paragraphs that read like that:

                  "If, by agreement, truncated representations are used the basic formats shall
                  be as specified below. In each case hyphens that indicate omitted components
                  shall be used only as indicated or shall be omitted."

                  It also tells anyone who wants to claim to be 8601 compliant not to make up
                  a format that might be read just like one of the real formats but missing
                  only some of the hyphens.

                  For example, I can't claim to be 8601 compliant and take one leading
                  hyphen off each of the examples in 5.2.1.3.

                  Thanks for the comments on how you read 8601.

                  -Paul
                • P A Hill & E V Goodall
                  Message 8 of 15 , Jul 17, 2001
                    g1smd@... wrote:
                    > > The note at 5.2.3.3 would not list only one format, but mention
                    > > all of those which one might think might have a leading dash
                    > > for missing 'century' and another for missing year pointing
                    > > out the simplification.
                    >
                    > Now I see what you are saying, I agree that the wording here
                    > is sub-optimal. You reach a place where you see a format you
                    > were not expecting, with no previous rationale as to why the
                    > format is shown like it is. Yes, the standard is deficient
                    > (unless 4.9 is where its at?) and requires extra notes.

                    Actually, I was assuming just mutually unambiguous choices
                    had been made, so was not expecting a particular format. I was
                    thrown off by the writers of the standard expecting a particular
                    format.

                    > I wasn't sure why you were hung up on this one word 'should'.
                    > Now you have explained more, then I am happy to agree with you.
                    > You are right. Although the standard works the way I have said,
                    > and the examples follow the method I have stated, nowhere in
                    > the standard does it state clearly that this is the case, or
                    > why it should be so, and several notes of clarification are
                    > obviously missing on a few examples.

                    I'm glad we got that worked out! Yes, it was more an editorial
                    analysis stated as "did I miss something", than a criticism of
                    a particular format.

                    Thanks for the interesting tables! I was just starting to
                    work on some like these myself.

                    -Paul
                  • P A Hill & E V Goodall
                    Message 9 of 15 , Jul 18, 2001
                      P A Hill & E V Goodall wrote:
                      > That does qualify as stating that are all tying to be unique.

                      Let's make that:

                      That does qualify as stating that all are trying to be unique.

                      -Paul
                    • g1smd@amsat.org
                      Message 10 of 15 , Aug 1, 2001
                        On 2001-Jul-16 Pete Forman wrote:


                        [2001-Aug-01]



                        >> In fact, the standard before the first truncated format in the
                        >> opening paragraph of 5.2.1.3 says "In each case hyphens that
                        >> indicate components should be used only as indicated or shall be
                        >> omitted."

                        >> That to me hints that some of choices are arbitrary, so don't play
                        >> around with them. Also, there is no place that states that all
                        >> formats are mutually unambiguous from each other. As I was reading
                        >> I was looking for just such a statement or examples that violated
                        >> the idea. I found neither, but that is no proof.

                        > How about the last paragraph in 4.1.

                        I agree with that. I didn't find the words '*mutually* unambiguous'.
                        Instead, I found '... unique and unambiguous', which I think just
                        about does the same job.



                        > I agree that the choices seem arbitrary. There may be some logic
                        > behind it though. My guess is that the rules are something like

                        > Replace an omitted component in a truncated format with a hyphen.
                        > Component may be century (first two digits of a four digit year)
                        > or decade (first three digits of a four digit year) or last two
                        > digits of the year or month or week. The term component is not
                        > defined as such but the components are listed for each of the
                        > truncated representations.

                        Not quite. I think that you appear to say that -1 is a year like
                        1981 or 2001, stated by omitting the decade. Adding the month to
                        this, to increase precision, will make -111. This can now be
                        confused with -DDD, day 111 of the year. So, you should modify
                        your statement to say that: (except for Day of Year [DDD] elements,
                        and Day of Week [D] elements) elements should always have an even
                        number of digits: YYYY, YYMM, MMDD, YYMMDD, etc. However, I can
                        see where you may have got this idea from. In the examples for
                        the various 'Week-of-Year and Day-of-Week' formats, there are
                        some three and some single digit year examples. However, in those
                        examples, the placement of the 'W' always clarifies what is going
                        on. In the Calendar and Ordinal date formats you cannot do this
                        with Basic Formats. The Year must be two or four digits, except
                        for some Extended Format dates that can have a three or single
                        digit year, because these cannot be mixed up with other formats:
                        -Y-DDD -YYY-DDD -Y-MM-DD -YYY-MM-DD and possibly -Y-MM and
                        -YYY-MM (and you can probably omit the leading hyphen on all
                        of these and get away with it). For most of these it is *not*
                        possible to have a Basic format (if 'Basic' formats are taken
                        to mean that hyphen separators *between* digits are omitted),
                        as these *will* then be confused with other pre-defined formats.



                        > Remove hyphens if the result is unambiguous.
                        > 4.6 para 1 states that a hyphen may be necessary to represent
                        > an omitted component. That implies to me that the hyphen
                        > should not be used if possible.

                        This 'ruling' seems arbitrary in the standard. A format like
                        12-12 isn't permitted at all, but 121212 is read as YYMMDD,
                        when I would expect YYYYMM to be the one. Similarly -121212
                        doesn't appear anywhere, when I think that -YYMMDD is expected
                        (like -1212 is -YYMM, for example).



                        > (The above two rules may also be expressed as: Components may
                        > be omitted, if the result is ambiguous then use a hyphen to
                        > stand for the omitted component.)

                        I almost agree with this, but I still don't understand why
                        121212 has to be YYMMDD, when YYYYMM would be more logical,
                        and this would then follow a 'pattern' with the other formats.
                        See the various tables that I included in my previous message,
                        posted 2001-Jul-16.



                        > Add hyphens where a format is ambiguous. This is bound to be
                        > arbitrary: if two formats collide one must be chosen to get
                        > the extra hyphen.

                        It isn't always arbitrary. '12' is always the first two digits
                        of a Year, so the last two digits of the Year are -12, the Month
                        is --12, and the Day is ---12. This is clear and logical. Note
                        that 12-12 isn't permitted at all, as it could be either YY-MM
                        or MM-DD. Instead -YY-MM and --MM-DD are used; so none of them
                        'get the extra hyphen' (in this context)... they both include a
                        hyphen or hyphens. So, two formats collide at '12-12' and,
                        rather than one getting a hyphen and the other not, the
                        '12-12' format isn't defined/used at all.
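
                        The leading-hyphen rule for a lone two-digit value, as
                        listed above, can be sketched in Python like this. It
                        is only an illustration of that one rule; the names
                        MEANING_BY_HYPHENS and two_digit_meaning are
                        hypothetical.

```python
# Meaning of a lone two-digit value by its number of leading hyphens,
# following the list in the paragraph above.
MEANING_BY_HYPHENS = {
    0: "first two digits of the year",
    1: "last two digits of the year",
    2: "month",
    3: "day of month",
}


def two_digit_meaning(text: str):
    """Return the meaning of e.g. '--12', or None if text is not of that shape."""
    digits = text.lstrip("-")
    hyphens = len(text) - len(digits)
    if len(digits) != 2 or not digits.isdigit():
        return None  # covers '12-12', which is not defined at all
    return MEANING_BY_HYPHENS.get(hyphens)


for sample in ("12", "-12", "--12", "---12", "12-12"):
    print(sample, "->", two_digit_meaning(sample))
```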



                        > Note that omitting hyphens by these rules is a separate issue to
                        > 5.2.1.3. I take the latter to mean that the communicating parties
                        > agree that, for example, two digits alone mean a month rather than
                        > using four characters of 5.2.1.3.e proper.

                        Yes, by mutual agreement I can say that '12' in one data element
                        is MM, and in another is DD, rather than the default of YY.



                        What comments have you got regarding the material in my message
                        dated 2001-Jul-16, under the heading 'A BIG MISTAKE'?



                        Cheers,

                        Ian.


                        <mail://g1smd@...>

                        <http://www.qsl.net/g1smd/>
                        <http://home.freeuk.net/g1smd/>
                        <http://ourworld.compuserve.com/homepages/dstrange/y2k.htm>

                        <ftp://ftp.funet.fi/pub/ham/misc/g1smd.zip>
                        <ftp://ftp.qsl.net/pub/g1smd/>


                        [2001-08-01]

                        .end
                      • g1smd@amsat.org
                        Message 11 of 15 , Aug 1, 2001
                          On 2001-Jul-17 Pete Forman wrote:


                          [2001-Aug-01]



                          >> I agree that this is a bit sloppy. It needs rewriting, or some
                          >> additional notes. The minimum required, would be to state that
                          >> the year may be specified by two, or by four, or by more digits;
                          >> and I see a problem here... 121212 is assumed to be YYMMDD,
                          >> but this could be the YYYYYY 121212.

                          > No, were an expanded representation to be used for a six digit
                          > year it would be +121212.

                          I did already refer to this in another paragraph, where I noted:
                          >>> .... I see a problem here... 121212 is assumed to be YYMMDD,
                          >>> but this could be the YYYYYY 121212. Having re-read the
                          >>> standard I see that para 4.7 does cover this. Additionally,
                          >>> para 4.8 does say that elements do all have a defined length,
                          >>> and that leading zeroes must be used to fulfil this. ....
                          but I forgot to repeat that note with the above text.



                          >> Also, the 'mutual agreement' problem appears here again.
                          >> Representations that have the prescribed leading hyphens omitted
                          >> can be used only by mutual agreement... except that the format at j
                          >> seems to be the default, rather than the format stated in i. That
                          >> is, for all the others, mutual agreement is required, but for j it
                          >> has already been forced upon us to agree to this. In doing this,
                          >> you get the 'logic error' with the RIGHT c entry being disallowed
                          >> (in order to satisfy the {non written, as far as I can see} rule
                          >> that any representation can have only one implied meaning unless
                          >> mutual agreement has already been obtained).

                          > As I said in my previous message I consider that there are two
                          > possible reasons for omitting hyphens. Mutual agreement is one.
                          > The other is that hyphens should/must only be used to stand for
                          > omitted components in order to disambiguate.

                          Unfortunately, since ISO mixed their logic in deciding on YYMMDD
                          over YYYYMM, this skews the expected logical 'pattern' of allowed
                          formats, as shown in the tables in my message posted 2001-Jul-16.
                          I think their choices of 'default' formats are somewhat arbitrary.



                          >> I guess they had to include YYMMDD and exclude YYYYMM simply
                          >> because millions of computer systems were already using YYMMDD.

                          > Not necessarily. We are talking about a date format, it
                          > is reasonable to give preference to the full precision
                          > interpretation over the reduced one.

                          So why isn't 1212 decoded as MMDD, instead of YYYY?
                          It seems very odd to me, that the formats go:
                              12121212  YYYYMMDD
                              121212    YYMMDD
                              1212      YYYY
                              12        YY (19 of 1950)
                          Surely, life would be much easier if 121212 were YYYYMM?

                          I was expecting one of the following patterns:
                              12121212  YYYYMMDD
                              121212    YYYYMM
                              1212      YYYY
                              12        YY (19 of 1950)
                          or:
                              12121212  YYYYMMDD
                              121212    YYMMDD
                              1212      MMDD
                              12        DD
                          or:
                              12121212  YYYYMMDD
                              121212    YYMMDD
                              1212      YYYY
                              12        YY (50 of 1950)
                          The last three all have a logical pattern to them, whereas
                          the first table (as derived from the ISO 8601 standard) does
                          not have a logical pattern. Have another look at the various
                          tables in my previous message (the one dated 2001-Jul-16)
                          for further information.
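
                          As a small Python sketch of the point, the default
                          reading of an all-digit string can be written as a
                          lookup on its length. ISO_DEFAULT follows the first
                          table above, and ALTERNATIVE follows the first of
                          the three expected patterns. Both dictionary names
                          and the function default_reading are hypothetical.

```python
# Default reading of an all-digit string by length, per the first table above.
ISO_DEFAULT = {8: "YYYYMMDD", 6: "YYMMDD", 4: "YYYY", 2: "YY (19 of 1950)"}

# First of the expected patterns: each step just drops the lowest element.
ALTERNATIVE = {8: "YYYYMMDD", 6: "YYYYMM", 4: "YYYY", 2: "YY (19 of 1950)"}


def default_reading(text: str, table=ISO_DEFAULT):
    """Return the format assumed for an all-digit string, or None if unknown."""
    if not text.isdigit():
        return None
    return table.get(len(text))


for sample in ("12121212", "121212", "1212", "12"):
    print(sample, default_reading(sample), default_reading(sample, ALTERNATIVE))
```

                          The two tables differ only in the six-digit row,
                          which is exactly where the YYMMDD versus YYYYMM
                          question arises.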



                          > In general ISO 8601 does not say that two digit years are evil. It
                          > passes them off as a specific case of a truncated representation.

                          Most formats that have a two digit year have a leading hyphen.
                          Only YYMMDD does not, at the expense of YYYYMM being disallowed.
                          I don't understand why.



                          >> The table can be rearranged to ask what a numerical format should
                          >> be decoded as. To keep it simple, I have not divided it into Basic
                          >> and Extended formats. Anything with a hyphen between elements is an
                          >> Extended format.

                          > As you probably realise, that contradicts 5.2.1.2.

                          That is so illogical. What is a Basic Format? What is an Extended Format?

                          A simple answer would be (you would think) that an Extended Format
                          includes separators between elements, and a Basic Format always has
                          them omitted. However, because someone at ISO decided that 121212
                          would be YYMMDD (the Basic version of YY-MM-DD), then YYYYMM has been
                          disallowed. A Year and Month always has to have a hyphen separator:
                          YYYY-MM. But why is it then called a Basic Format? This is the only
                          Basic Format in the whole standard that includes any separators.

                          I repeat, again, just what is a Basic Format? Give me a simple
                          definition. Hyphen separators are not it; unless ISO have made
                          a mistake and it is meant to be:

                          Year and Month:
                          ---------------
                          *Extended* Format: YYYY-MM
                          Basic Format: Not Applicable (because 121212 is YYMMDD)

                          but as already stated, I think ISO made a fundamental error in
                          allowing YYMMDD over YYYYMM in the first place. That is where
                          the heart of the whole problem lies.
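
                          The clash between YYMMDD and YYYYMM can be sketched in a few
                          lines of Python. This is only an illustration: the function
                          name and the default century of 1900 are assumptions, and the
                          YYYYMM reading is the one the standard disallows.

```python
from datetime import date


def six_digit_readings(text, century=1900):
    """Return every plausible reading of a six-digit string.

    'YYMMDD' is the reading ISO 8601:2000 allows; 'YYYYMM' is the
    hypothetical reading that the standard disallows. The default
    century is an assumption made only for this illustration.
    """
    if len(text) != 6 or not text.isdigit():
        raise ValueError("expected exactly six digits")

    readings = {}

    # Reading 1: two-digit year, month, day.
    yy, mm, dd = int(text[0:2]), int(text[2:4]), int(text[4:6])
    try:
        readings["YYMMDD"] = date(century + yy, mm, dd).isoformat()
    except ValueError:
        pass  # month or day out of range for this reading

    # Reading 2: four-digit year and month (not permitted by the standard).
    yyyy, month = int(text[0:4]), int(text[4:6])
    if 1 <= month <= 12:
        readings["YYYYMM"] = f"{yyyy:04d}-{month:02d}"

    return readings
```

                          For '121212' both readings survive, so a reader cannot tell
                          them apart from the digits alone; one of the two had to be
                          ruled out.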



                          >> Writing the table this way, I have included some formats that the
                          >> ISO standard says are 'Not Applicable'. There cannot be a way to
                          >> tell if '1950' is supposed to be a Basic format Year or an Extended
                          >> format Year. I have ignored this and included it under both styles.

                          > Again, the standard is clear that '1950' is basic format only.
                          > I take 'Not Applicable' to mean "don't use this".

                          Take a date like 1212-12-12, reduce the precision to 1212-12,
                          then to 1212. Now do the same with 12121212: reduce to 1212-12
                          (121212 not allowed!!), then to 1212. So, 1212-12-12 is an
                          Extended Format, and 12121212 is a Basic Format; but both reduce
                          to 1212 for just the Year. So, really, 1212 could be a Basic
                          Format or an Extended Format; there is no way to tell. What I
                          think the ISO standard means by 'Not Applicable' is simply this:
                          because 1212 does not contain any hyphen separators, 1212
                          (Extended) is exactly the same as 1212 (Basic). The Extended
                          Format does not have its own unique representation, so there is
                          no need to repeat the one already shown for the Basic Format;
                          you just use that. However, I am also assuming that the
                          difference between an Extended Format and a Basic Format is
                          that the Basic Format does not include any hyphens used as
                          separators.
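
                          The reduction described above can be traced with a small
                          Python sketch. The function name is hypothetical, and it
                          handles only the four shapes used in this example (YYYY-MM-DD,
                          YYYYMMDD, YYYY-MM and YYYY), not the whole standard.

```python
def reduce_precision(text):
    """Drop the least significant element from a calendar date string."""
    if len(text) == 10 and text[4] == "-" and text[7] == "-":
        return text[:7]                      # YYYY-MM-DD -> YYYY-MM
    if len(text) == 8 and text.isdigit():
        # YYYYMMDD -> YYYY-MM, because YYYYMM is not allowed.
        return text[:4] + "-" + text[4:6]
    if len(text) == 7 and text[4] == "-":
        return text[:4]                      # YYYY-MM -> YYYY
    raise ValueError("cannot reduce: " + text)


# Walk both starting points down to the year alone.
extended_path = ["1212-12-12"]
basic_path = ["12121212"]
for path in (extended_path, basic_path):
    while len(path[-1]) > 4:
        path.append(reduce_precision(path[-1]))
```

                          Both paths pass through 1212-12 and end at 1212, so from the
                          Year-and-Month level downwards nothing in the string records
                          whether it started out as Basic or Extended.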



                          > What we could do with is a rationale for the standard. I wonder
                          > if one was produced.

                          > The other useful production would be a general reader. Given any
                          > input string it should be possible to determine whether it is basic
                          > or extended, full or reduced precision, expanded or not, truncated
                          > or not, calendar or ordinal or week. (At a higher level we need to
                          > determine whether a string is a date, time, interval, etc.)

                          There is NO pattern to the ISO standard. Many of the choices
                          are arbitrary... viz YYMMDD vs YYYYMM and so on. This makes
                          finding a 'simple' rule impossible.



                          > A start for this might be

                          > Parse as date:
                          >   Does it contain a 'W'?
                          >     => parse as week date
                          >   else Does it have an even number of digits? **
                          >     => parse as calendar date
                          >   else
                          >     => parse as ordinal date

                          > Parse as calendar date:
                          >   Split into fields of hyphen or pair-of-digits or plus (1st only)
                          >   Match against candidate formats

                          > Parse as ordinal date:
                          >   Split into fields of hyphen or pair-of-digits or plus (1st only)
                          >   or triple-of-digits (last only)
                          >   Match against candidate formats

                          >**This assumes that expanded formats use an even number of digits for
                          > the year. A different approach might tolerate an odd number.
                          > Actually, in order to parse expanded formats the number of digits
                          > for the year must be known otherwise there is no way to distinguish
                          > days from years from centuries. Years before 0000 are problematic
                          > as well. But according to 4.3.2.1 mutual agreement is needed for
                          > years prior to 1582 anyway.
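
                          The top-level "Parse as date" step quoted above might look
                          like this in Python. It is a sketch under the same assumption
                          as the ** note (expanded years use an even number of digits);
                          the function name and the returned labels are hypothetical.

```python
def classify_date(text):
    """Pick a date family using the quoted rule of thumb.

    Assumes any expanded year has an even number of digits, so that
    an odd digit count can only come from a three-digit day of year.
    """
    if "W" in text:
        return "week"
    digit_count = sum(ch.isdigit() for ch in text)
    if digit_count % 2 == 0:
        return "calendar"
    return "ordinal"
```

                          For example, classify_date("2001-197") counts seven digits and
                          returns "ordinal", while classify_date("20010716") counts
                          eight and returns "calendar".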

                          > The table for calendar dates then starts:

                          > Number of  Fields       Format      Section      Note
                          > fields
                          > 1          +            illegal
                          > 1          -            illegal
                          > 1          2            YY          5.2.1.2.c.B
                          > 2          + -          illegal
                          > 2          + 2          +YY         5.2.1.4.d.B  1
                          > 2          - -          illegal
                          > 2          - 2          -YY         5.2.1.3.c.B  2
                          > 2          2 -          illegal
                          > 2          2 2          YYYY        5.2.1.2.b.B
                          > ...
                          > 4          2 2 2 2      YYYYMMDD    5.2.1.1.B
                          > ...
                          > 6          2 2 - 2 - 2  YYYY-MM-DD  5.2.1.1.E

                          I see that your table deals only with stuff that begins with the
                          Year. That is all easy. See if you can finish it, when you deal
                          with left-truncated stuff: both full and reduced precision.
                          It becomes a LOT more difficult.
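
                          For the easy, year-first part, the "split into fields, then
                          match against candidate formats" idea can be sketched in
                          Python as below. Only the rows actually quoted in the table
                          are included; the rows hidden behind "..." and all
                          left-truncated forms are deliberately left out, and every
                          name here is hypothetical.

```python
# Field patterns copied from the quoted partial table.
# "2" stands for a pair of digits; "+" and "-" are literal signs.
CALENDAR_FORMATS = {
    ("+",): "illegal",
    ("-",): "illegal",
    ("2",): "YY",
    ("+", "-"): "illegal",
    ("+", "2"): "+YY",
    ("-", "-"): "illegal",
    ("-", "2"): "-YY",
    ("2", "-"): "illegal",
    ("2", "2"): "YYYY",
    ("2", "2", "2", "2"): "YYYYMMDD",
    ("2", "2", "-", "2", "-", "2"): "YYYY-MM-DD",
}


def split_fields(text):
    """Split into a leading plus (first position only), hyphens and digit pairs."""
    fields = []
    pos = 0
    if text.startswith("+"):
        fields.append("+")
        pos = 1
    while pos < len(text):
        pair = text[pos:pos + 2]
        if text[pos] == "-":
            fields.append("-")
            pos += 1
        elif len(pair) == 2 and pair.isdigit():
            fields.append("2")
            pos += 2
        else:
            raise ValueError("unexpected character at position %d" % pos)
    return tuple(fields)


def match_calendar_format(text):
    """Return the table entry for text, or None if the row was not quoted."""
    return CALENDAR_FORMATS.get(split_fields(text))
```

                          A string such as 2001-07 splits cleanly into fields but
                          returns None, because its row sits in one of the "..." gaps;
                          filling those gaps, and adding the truncated forms, is where
                          the real work lies.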



                          > Notes:
                          > 1 Implicitly assume that expanded representation years have 4 digits
                          > 2 Implicitly assume that expanded representation years are positive
                          > or have more than 4 digits

                          > Different versions of tables would be needed for different expanded
                          > representations. Expanded and truncated representations are mutually
                          > exclusive. The agreement between parties has to be inspected to
                          > establish whether a leading hyphen means a negative year or truncated
                          > representation.

                          It gets very complicated, doesn't it? My tables of Allowed and
                          Disallowed formats in the message dated 2001-Jul-16 may help
                          to guide you to look for logic errors.



                          Cheers,

                          Ian.


                          <mail://g1smd@...>

                          <http://www.qsl.net/g1smd/>
                          <http://home.freeuk.net/g1smd/>
                          <http://ourworld.compuserve.com/homepages/dstrange/y2k.htm>

                          <ftp://ftp.funet.fi/pub/ham/misc/g1smd.zip>
                          <ftp://ftp.qsl.net/pub/g1smd/>


                          [2001-08-01]

                          .end
                        • P A Hill & E V Goodall
                          Message 12 of 15 , Aug 2, 2001
                            g1smd@... wrote:
                            > Adding the month to
                            > this, to increase precision, will make -111. This can now be
                            > confused with -DDD, day 111 of the year.

                            I think this is why increased precision is only done when the
                            exchanging parties agree. I think this gets around the problem of
                            whether, given some arbitrary sequence, we can guess what it is.

                            -Paul