Re: Statistics on reusing request headers in persistent connections

  • Glenn Adams
    Message 1 of 27, Nov 1, 1995
      >Date: Wed, 1 Nov 1995 03:18:08 +0100 (MET)
      >From: "Balint Nagy Endre" <bne@...>
      >
      >but unfortunately Lynx doesn't support that. (NOTE: I don't know, what is
      >the proper language tag for russian, I assumed "ru" but it is only a
      >(bad) guess.) Others may have a significantly longer list of languages.

      RFC 1766 (which specifies the structure of the value of ACCEPT-LANGUAGE
      and other headers which take language tag values) specifies that 2 character
      language tags follow ISO 639 for which see:

      http://www.stonehand.com/unicode/standard/iso639.html

      However, since ISO 639 contains only 136 tags, the authors of the HTML I18N
      I-D have specified an extension such that three character language tags
      follow the language identifiers specified by the Ethnologue, 12th Edition,
      which contains 6790 3-character tags, for which, see:

      http://www.stonehand.com/unicode/standard/ethn12.html

      For more information on the Ethnologue, see:

      gopher://sil.org:70/11/gopher_root/ethnologue/

      I would recommend that the HTTP specifications also be amended to specify
      the same convention for using three character language identifiers since
      this convention will be employed by HTML in specifying the value of the LANG
      attribute.

      Regards,
      Glenn Adams
    • Koen Holtman
      Message 2 of 27, Nov 1, 1995
        Balint Nagy Endre:
        >Koen's modelling does not contain Accept-Language, which will be important
        >in future, adding some bytes to headers.
        [...]
        >Accept-Language: hu; q=1, en; q=0.75, ru; q=0.5, de;q=0.25
        >(60 bytes, including CRLF)

        Well, 200 bytes plus 60 bytes for Accept-Language is still a lot lower than
        the 500 byte request messages at which header reuse would get moderately
        interesting.
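
The arithmetic above can be checked with a minimal sketch. The header line is the one quoted in this message; the 200-byte baseline and the 500-byte threshold are the figures cited in this thread, and the names `wire_size`, `BASELINE_REQUEST_BYTES`, and `REUSE_THRESHOLD_BYTES` are hypothetical, introduced only for illustration.

```python
# Header line copied from the quoted message.
header = "Accept-Language: hu; q=1, en; q=0.75, ru; q=0.5, de;q=0.25"


def wire_size(line: str) -> int:
    """Bytes on the wire for one header line, including the trailing CRLF."""
    return len((line + "\r\n").encode("ascii"))


# Figures cited in the thread, not measured here (assumptions for this sketch).
BASELINE_REQUEST_BYTES = 200
REUSE_THRESHOLD_BYTES = 500

total = BASELINE_REQUEST_BYTES + wire_size(header)
print(wire_size(header), total, total < REUSE_THRESHOLD_BYTES)
```

With these figures, the request stays well under the size at which header reuse is said to become interesting.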

        And of course, there is no need to ever send Accept-Language headers at all
        if we have a good(*) specification and implementation of reactive content
        negotiation.

        (*) By my definition of good, that is: I feel that good reactive negotiation
        will always give me a `multiple options' or `none acceptable' with the
        alternatives if my accept headers are missing or ambiguous.

        A browser should only send Accept-language headers if it suspects that doing
        so will prevent a reactive content negotiation cycle. Thus, you start
        sending Accept-language if you got a `multiple options' response to an
        earlier request on that server, and it turned out that language was relevant
        in the options.

        Also, in a hash-based content negotiation scheme, the Accept-language header
        could be hashed along with the Accept header.

        Basically, everything I said about Accept: in my report also holds for
        Accept-language: .

        >[Jeffrey Mogul writes:]
        >> I think Larry Masinter's hash-based approach still seems like the
        >> right one here.

        Balint Nagy Endre:
        >Agree.

        I do not like the hash-based approach very much. It seems to me that it is
        a special case of reactive negotiation.

        One of the problems I have with hash-based content selection is that it
        places a high penalty (one round trip time) on the first content
        selection by the server. That means that putting up a home page with inline
        links that serve .gifs in the normal case, but .jpgs if the browser can
        display them is less attractive.

        Under normal reactive negotiation, you could send either
        Accept: image/gif image/jpg
        or
        Accept: image/gif
        and no extra round trip would be required.
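
A minimal sketch of why no extra round trip is needed in this case: the server can pick a variant directly from the Accept value that arrives with the request. This is only an illustration under stated assumptions, not the negotiation algorithm of any HTTP draft. The function names `parse_accept` and `choose_variant` are hypothetical, the media types are written comma-separated, and wildcards such as `*/*` are not handled.

```python
def parse_accept(value: str) -> dict[str, float]:
    """Map each media type in an Accept value to its q-value (default 1.0)."""
    prefs: dict[str, float] = {}
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip() == "q":
                q = float(val)
        prefs[media_type] = q
    return prefs


def choose_variant(accept_value: str, variants: list[str]):
    """Return the acceptable variant with the highest q-value, or None.

    `variants` is listed in the server's own order of preference, which
    breaks ties between equally acceptable types.
    """
    prefs = parse_accept(accept_value)
    acceptable = [v for v in variants if prefs.get(v, 0.0) > 0.0]
    if not acceptable:
        return None
    return max(acceptable, key=lambda v: prefs[v])


server_variants = ["image/jpg", "image/gif"]  # server prefers the .jpg
print(choose_variant("image/gif, image/jpg", server_variants))
print(choose_variant("image/gif", server_variants))
```

A browser that can display both formats gets the .jpg, and one that only lists image/gif gets the .gif, both in a single request.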

        Koen.
      • Larry Masinter
        Message 3 of 27, Nov 1, 1995
          > I do not like the hash-based approach very much. It seems to me that it is
          > a special case of reactive negotiation.

          It is.

          > One of the problems I have with hash-based content selection is that it
          > places a high penalty (one round trip time) on the first content
          > selection by the server. That means that putting up a home page with inline
          > links that serve .gifs in the normal case, but .jpgs if the browser can
          > display them is less attractive.

          Don't use it for embedded gifs and jpgs. In particular, if your accept
          headers are short (as they are for image/gif image/jpg) the hash would
          be longer than the material hashed, a bad idea.

          I only proposed this as an optimization in the case where the Accept*
          headers are large.
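
The size argument can be sketched as follows. The thread does not specify a hash function or an encoding at this point, so MD5 with a hex encoding is purely an assumption made for illustration, and the names `accept_hash` and `worth_hashing` are hypothetical.

```python
import hashlib


def accept_hash(value: str) -> str:
    """Hex digest of an Accept value (MD5 and hex are assumptions here)."""
    return hashlib.md5(value.encode("ascii")).hexdigest()


def worth_hashing(value: str) -> bool:
    """True only if the hash is shorter than the material it replaces."""
    return len(accept_hash(value)) < len(value)


short_value = "image/gif, image/jpg"
long_value = ", ".join(f"application/x-type-{i}" for i in range(20))

print(len(short_value), len(accept_hash(short_value)), worth_hashing(short_value))
print(len(long_value), len(accept_hash(long_value)), worth_hashing(long_value))
```

For the short image list the digest is longer than the header value it would replace, so hashing only pays off once the Accept* headers grow large.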
        • Harald.T.Alvestrand@uninett.no
          Message 4 of 27, Nov 2, 1995
            Glenn,
            thanks for reminding me....

            I object strongly to the paragraph in I18N that says:


            Two-letter primary-tags are reserved for ISO 639 language abbreviations
            [ISO-639], and three-letter primary-tags for the language
            abbreviations of the "Ethnologue" [ETHNO] (the latter is in addition
            to the requirements of RFC 1766). Any two-letter initial subtag is an
            ISO 3166 country code [ISO-3166].

            The reason is what I stated in RFC 1766:


            The reason for reserving all other tags is to be open towards new
            revisions of ISO 639; the use of "i" and "x" is the minimum we can do
            here to be able to extend the mechanism to meet our requirements.

            If you wish to register I-SIL-nnn as a standard for three-letter
            Ethnologue-based tags, or even want to push for updating RFC 1766 to
            include S-nnn as a new category, I would not argue against that, but
            I would like to stick to the principle of using ISO standards for
            the basic namespace.

            Otherwise, we will end up with a really confusing situation once the
            ISO 3166 three-letter project finishes (if it ever does); its tags are
            SURE to conflict with the SIL tag.
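
To make the disputed namespace concrete, here is a minimal sketch that sorts a tag's primary part into the classes discussed above: the "i" and "x" prefixes and two-letter ISO 639 tags from RFC 1766, plus the three-letter Ethnologue class that the HTML I18N draft proposes. It checks only the shape of the tag, not membership in any code list, and the function names and returned labels are hypothetical.

```python
def classify_primary_tag(tag: str) -> str:
    """Classify the primary tag by shape only; no code list is consulted."""
    primary = tag.split("-")[0].lower()
    if primary == "i":
        return "iana-registered"       # RFC 1766
    if primary == "x":
        return "private-use"           # RFC 1766
    if primary.isascii() and primary.isalpha():
        if len(primary) == 2:
            return "iso639"            # RFC 1766
        if len(primary) == 3:
            return "ethnologue-proposed"  # HTML I18N draft only; reserved in RFC 1766
    return "reserved"


def country_subtag(tag: str):
    """Return a two-letter first subtag (an ISO 3166 country code), if present."""
    parts = tag.split("-")
    if len(parts) > 1 and len(parts[1]) == 2 and parts[1].isalpha():
        return parts[1].upper()
    return None


for example in ["ru", "en-US", "i-sil-abc", "x-klingon", "abc"]:
    print(example, classify_primary_tag(example), country_subtag(example))
```

Under RFC 1766 alone, a bare three-letter tag such as "abc" falls in the reserved space, which is exactly the space a later three-letter ISO code list would also want to use.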

            Harald A
          • Glenn Adams
            Message 5 of 27, Nov 2, 1995
              From: Harald.T.Alvestrand@...
              Date: Thu, 02 Nov 1995 11:33:53 +0100

              If you wish to register I-SIL-nnn as a standard for three-letter
              Ethnologue-based tags, or even want to push for updating RFC 1766 to
              include S-nnn as a new category, I would not argue against that, but
              I would like to stick to the principle of using ISO standards for
              the basic namespace.

              Since IANA does not otherwise use such a principle I don't know why you
              would adopt it here, let alone insist on it.

              Otherwise, we will end up with a really confusing situation once the
              ISO 3166 three-letter project finishes (if it ever does); its tags are
              SURE to conflict with the SIL tag.

              Even more incongruent does your "principle" appear given the laggardness
              of this particular ISO work item. It is extremely unlikely that ISO or
              anyone else for that matter will do as comprehensive a job as SIL has done
              in creating their language database.

              I was rather surprised to learn that you did not even know about the
              Ethnologue database prior to writing your RFC. Let's just forget about 3166
              and use what exists, namely 639 for 2 letter codes and the Ethnologue for
              three letter codes. Unless you can give a firm estimate of when (or if)
              ISO is going to actually produce a revision to 639, then your objection
              surely sounds quite empty to me.

              Regards,
              Glenn
            • Koen Holtman
              Message 6 of 27, Nov 2, 1995
                Larry Masinter:

                [...hash based approach to content negotiation...]

                >Don't use it for embedded gifs and jpgs. In particular, if your accept
                >headers are short (as they are for image/gif image/jpg) the hash would
                >be longer than the material hashed, a bad idea.
                >
                >I only proposed this as an optimization in the case where the Accept*
                >headers are large.

                If it is an optional optimization, I have no problems at all with it.

                I don't believe I ever saw a complete description of how hashing would
                fit in with the rest of the negotiation stuff. Is there such a
                description, or do you plan to write one?

                Koen.
              • Ned Freed
                Message 7 of 27, Nov 2, 1995
                  > If you wish to register I-SIL-nnn as a standard for three-letter
                  > Ethnologue-based tags, or even want to push for updating RFC 1766 to
                  > include S-nnn as a new category, I would not argue against that, but
                  > I would like to stick to the principle of using ISO standards for
                  > the basic namespace.

                  If there's even the slightest chance that multiple three letter code schemes
                  will exist in the future then the S-nnn approach seems like the way to go to
                  me.

                  > Since IANA does not otherwise use such a principle I don't know why you
                  > would adopt it here, let alone insist on it.

                  Incorrect. IANA does use such a principle elsewhere: In handling the
                  registration of top-level domains it follows ISO country codes. There was a lot
                  of wailing about this policy at one point but as far as I know it has never
                  been changed.

                  > Otherwise, we will end up with a really confusing situation once the
                  > ISO 3166 three-letter project finishes (if it ever does); its tags are
                  > SURE to conflict with the SIL tag.

                  > Even more incongruent does your "principle" appear given the laggardness
                  > of this particular ISO work item. It is extremely unlikely that ISO or
                  > anyone else for that matter will do as comprehensive a job as SIL has done
                  > in creating their language database.

                  The problem as I see it is the adoption of a different set of criteria for
                  language tags here than what was used in the content-language work. I for one
                  don't especially care what we use as long as it is the same everywhere. I do
                  NOT want to have two different language tag namespaces to contend with. In fact
                  I'm not particularly happy with the notion of having overlapping subspaces, but
                  I think that's inevitable.

                  > I was rather surprised to learn that you did not even know about the
                  > Ethnologue database prior to writing your RFC. Let's just forget about 3166
                  > and use what exists, namely 639 for 2 letter codes and the Ethnologue for
                  > three letter codes. Unless you can give a firm estimate of when (or if)
                  > ISO is going to actually produce a revision to 639, then your objection
                  > surely sounds quite empty to me.

                  You may not approve of it, but we have a proposed standard for language tags in
                  place -- RFC1766. And this is what HTTP needs to use. Defining another scheme
                  specifically for HTTP is not acceptable.

                  Now, it may well be that RFC1766 is broken. If it is then we need to fix it.
                  It's almost time for it to be up for review anyhow. It may be somewhat harder to
                  fix RFC1766 than to invent an incompatible scheme for HTTP, but the difficulty
                  of the task should not prevent us from doing the right thing.

                  In summary, the only thing here that "sounds empty" to me is the notion that
                  it's acceptable to have two different sets of language tags in different IETF
                  work items and that it's acceptable to avoid revising documents that need revision.
                  Glenn may have an excellent case for putting the SIL codes in RFC1766. He may
                  even have a case for putting the SIL codes in without an "S-" introducer and
                  putting the introducer on the son-of-639 codes should they ever appear. If so,
                  he needs to bring this up with the WG that produced the content-language
                  specification -- the MAILEXT WG, I believe.

                  In fact it may even be possible to revise the tags specification and get a new
                  version of it out before the work on HTTP is finished. The MAILEXT WG has a
                  proven track record of getting more work done faster than any other group I've
                  been associated with.

                  Ned
                • Roy T. Fielding
                  Message 8 of 27, Nov 2, 1995
                    Just a clarification. HTTP is *not* going to use a language tag that
                    is different than that defined by RFC 1766 (or its successors). Any change
                    proposed must first be made to that document, and that should be done
                    within the MAILEXT WG (or through discussion with Harald, the author).

                    It is not a topic for discussion within the HTTP WG.


                    ...Roy T. Fielding
                    Department of Information & Computer Science (fielding@...)
                    University of California, Irvine, CA 92717-3425 fax:+1(714)824-4056
                    http://www.ics.uci.edu/~fielding/
                  • Larry Masinter
                    Message 9 of 27, Nov 2, 1995
                      > I don't believe I ever saw a complete description of how hashing would
                      > fit in with the rest of the negotiation stuff. Is there such a
                      > description, or do you plan to write one?

                      I'm not sure if content negotiation deserves a separate document from
                      the rest of HTTP, or whether all of the issues are independent. I
                      don't currently have any plans to write up this idea beyond what's
                      been posted, unless there's a call for it.
                    • Harald.T.Alvestrand@uninett.no
                      Message 10 of 27, Nov 3, 1995
                        Glenn,
                        what I desire is that we have a single decision that is valid
                        every time we want to indicate a language.

                        In this case, it means that if you think 1766 is broken, you should
                        work to change 1766, not introduce an incompatible naming scheme
                        within the HTML I18N work.

                        (You're not the only one - I also complained to the SRVLOC group
                        about their desire to use a fixed-length language field of FOUR
                        characters. SIL codes would fit right in...)

                        When we choose to rely on an outside source for names, we have to
                        consider:

                        - Is the source available?
                        - Is the source stable?
                        - Is the source reliable?
                        - Is the source policy acceptable to our members?

                        In the case of ISO, we might not like everything they are doing,
                        but its warts are something we have grown used to over the years.

                        In the case of SIL, I know that they chose to do their work for
                        a specific purpose (supporting and targeting the work of bible
                        translation into new languages), but I do not know anything about
                        their policies about changes to the language database, documentation
                        of changes or reuse of deassigned identifiers.

                        I don't say that these argue against SIL codes, just that we have
                        to know what we are doing, and make sure we change the standard
                        in the right place.

                        Harald A
                      • Glenn Adams
                        Message 11 of 27, Nov 3, 1995
                          Date: Thu, 02 Nov 1995 16:51:07 -0800 (PST)
                          From: Ned Freed <NED@...>

                          Now, it may well be that RFC1766 is broken. If it is then we need to
                          fix it.

                          I don't believe I said it was "broken". I would simply say it is in
                          need of augmenting as described under the text for "primary language tags":

                          "Other values cannot be assigned except by updating this standard."

                          In summary, the only thing here that "sounds empty" to me is the
                          notion that it's acceptable to have two different sets of language tags
                          in different IETF work items and that it's acceptable to avoid
                          revising documents that need revision.

                          OK, then let's update 1766. I propose that 3 character primary language
                          tags be those assigned by the Ethnologue, 12th edition, this list being
                          available here:

                          http://www.stonehand.com/unicode/standard/ethn12.html

                          This list contains 6790 language identifiers covering the entire known
                          set of living and extinct languages.

                          The HTML I18N I-D has already proposed this extension due to the current
                          limited coverage of 2 character ISO 639 language tags.

                          Would you like me to write up an I-D?

                          Regards,
                          Glenn Adams
                        • Glenn Adams
                          Message 12 of 27, Nov 3, 1995
                            Date: Thu, 02 Nov 1995 18:14:17 -0800
                            From: "Roy T. Fielding" <fielding@...>

                            Just a clarification. HTTP is *not* going to use ...

                            Wow! I wish I could say "HTML is *not* going to use BLINK, FONT,
                            multiple BODY elements, ..." It would sure make life easier.

                            Come on Roy, whose crystal ball are you using?

                            Regards,
                            Glenn
                          • Glenn Adams
                            Message 13 of 27, Nov 3, 1995
                              From: Harald.T.Alvestrand@...
                              Date: Fri, 03 Nov 1995 10:06:25 +0100

                              I don't say that these argue against SIL codes, just that we have
                              to know what we are doing, and make sure we change the standard
                              in the right place.

                              Well, I don't know how pertinent it is to know about agendas here. We
                              could certainly question ISO in this regard also. It seems to me that
                              SIL's agenda here is quite irrelevant and that it is the end product of
                              their agenda that is useful. Namely, the language list and identifiers
                              list itself. If we should make any judgements, it should be along the
                              lines of comprehensiveness, which nobody can fault SIL for.

                              I'd like to improve 1766 to make it more comprehensive. You couldn't
                              help it that 639 is so limited. So I'm really not faulting 1766. It
                              seems it is important to recognize its limitations based on 639's limits
                              and move on from there. Given that the SIL list is available and a new
                              list from ISO is not forthcoming (it's been stuck as a CD for over 5
                              years now with little activity), I think that we should move ahead and
                              use the SIL list.

                              I would be pleased to assist in improving 1766 so it can be a single,
                              comprehensive language tag standard, something I also want.

                              Regards,
                              Glenn
                            • Koen Holtman
                              Message 14 of 27, Nov 3, 1995
                                Larry Masinter:
                                >
                                >> I don't believe I ever saw a complete description of how hashing would
                                >> fit in with the rest of the negotiation stuff. Is there such a
                                >> description, or do you plan to write one?
                                >
                                >I'm not sure if content negotiation deserves a separate document from
                                >the rest of HTTP, or whether all of the issues are independent.

                                Oh, I was not thinking along the lines of a separate document, more
                                along the lines of a message on http-wg. I just searched the http-wg
                                archives, and the only thing I could find was a message from you
                                archived as

                                http://www.ics.uci.edu/pub/ietf/http/hypermail/1995q3/0674.html :

                                |Did you miss the suggestion that clients hash all
                                |'content-type-determining headers' and send them as a 'accept-hash:'
                                |instead? I suppose I'm choosing Good and Fast, at the expense of a
                                |little extra implementation complexity. One way to think of this is
                                |that hashing is a kind of compression mechanism -- you can compress
                                |any amount of data into 128 bits, but decompression can be very slow
                                |and take a large amount of communication.

                                I don't believe I ever saw the original suggestion referred to above,
                                and I can't find it in the archive. Perhaps it was sent in the
                                timeframe that the http-wg mail server was broken and only delivering
                                50% of all messages on a good day?

                                Things I would like to know are:

                                - How does a server resolve a 'hash miss'?

                                - If a server gets

                                    GET /blah/picture HTTP/1.0
                                    Accept: image/gif;q=0.5
                                    Accept-Hash: 4592462137846218

                                  and it has an image/gif and an image/jpg variant, must it, may it, or
                                  must it not resolve the hash to see if there is an image/jpg;q=1.0 in it?

                                > I
                                >don't currently have any plans to write up this idea beyond what's
                                >been posted, unless there's a call for it.

                                Could you check if your original accept-hash proposal is in the
                                http-wg archives? I could not find it, but that does not mean it is
                                not there. If you can't find it either, I suggest you repost the
                                original article (if you still have it) either now or at the time we
                                start discussing negotiation again.

                                Koen.
                              • Ned Freed
                                Message 15 of 27, Nov 3, 1995
                                  > OK, then let's update 1766. I propose that 3 character primary language
                                  > tags be those assigned by the Ethnologue, 12th edition, this list being
                                  > available here:

                                  > http://www.stonehand.com/unicode/standard/ethn12.html

                                  > This list contains 6790 language identifiers covering the entire known
                                  > set of living and extinct languages.

                                  > The HTML I18N I-D has already proposed this extension due to the current
                                  > limited coverage of 2 character ISO 639 language tags.

                                  > Would you like me to write up an I-D?

                                  No. You need to bring this up in the proper WG (MAILEXT) as a suggested
                                  enhancement to RFC1766. There is no need for a separate, additional document.
                                  This then needs to be discussed there and, if approved, I'm sure Harald will
                                  have no problem with adding this to the next revision of RFC1766. As I said
                                  before, the mandatory six month review period for RFC1766 is nearly up so this
                                  is an ideal time to make such a change.

                                  This isn't going to be totally trivial, however -- there are substantive
                                  questions regarding the stability and update mechanisms for the SIL work that
                                  need to be answered before it's clear that the SIL codes can in fact be used. In
                                  particular, a URL is *not* an adequate means of communicating such things --
                                  the standards process currently requires either a publication to cite,
                                  publication of the list as an RFC (thus creating something that can be cited),
                                  or establishment of some form of registration authority along the lines of
                                  IANA. I do not believe a URL is an acceptable substitute.

                                  These are all largely procedural questions, and it should be possible to deal
                                  with them all, but they do have to be dealt with.

                                  Ned
                                • Ned Freed
                                  Message 16 of 27, Nov 3, 1995
                                    I apologize for continuing this discussion on the HTTP list. It really needs to
                                    move to the MAILEXT list ASAP.

                                    > I don't say that these argue against SIL codes, just that we have
                                    > to know what we are doing, and make sure we change the standard
                                    > in the right place.

                                    > Well, I don't know how pertinent it is to know about agendas here. We
                                    > could certainly question ISO in this regard also. It seems to me that
                                    > SIL's agenda here is quite irrelevant and that it is the end product of
                                    > their agenda that is useful. Namely, the language list and identifiers
                                    > list itself. If we should make any judgements, it should be along the
                                    > lines of comprehensiveness, which nobody can fault SIL for.

                                    I think you're missing the point Harald is trying to make here. The question
                                    isn't one of whether or not the codes are technically accurate, fairly
                                    assigned, or anything like that. The question is one of how stable the list is
                                    and what methods are used to update it. And this agenda *is* pertinent. In fact
                                    it's absolutely crucial.

                                    You've provided the list in the form of a URL that appears to be attached to
                                    your site. Suppose we issue a document tomorrow containing the URL and you go
                                    out of business the day after that? The URL is no longer valid and the list can
                                    no longer be obtained. I suppose someone could call SIL and find a new
                                    location, assuming one exists, but that's really not acceptable.

                                    Suppose there's an academic revolt amongst the linguists at SIL and the list is
                                    radically revised. (You know as well as I do that such things do happen in
                                    academia from time to time.) The document is then updated with the new list and
                                    many old codes either vanish or are assigned to different things. Chaos ensues.

                                    In order to use the list as a standard we need some assurance that these sorts
                                    of things will not happen. We normally get that assurance by citing specific,
                                    published entities. You may not like the ISO (I have a number of problems with
                                    them myself), but their process ensures that a citation to a given document
                                    will always remain valid. Similarly, publication as an RFC provides a stable
                                    reference and also ensures that updates will be made according to IETF
                                    guidelines. And these are not the only ways such assurances can be provided --
                                    they are simply examples of how it has been done in the past.

                                    The simplest thing to do would be to publish the current list as an RFC.
                                    Updates from the SIL would be possible and probably encouraged, but would be
                                    subject to some review by the IETF to ensure compatibility is maintained. (I
                                    see absolutely no point in reviewing things as to technical linguistic accuracy
                                    in the IETF. The IETF lacks the expertise to make such judgements. On the other
                                    hand, the IETF does have the competence to assess compatibility issues
                                    according to the installed base, which is something the SIL does not have.)

                                    Just because it's the simplest thing to do doesn't make it the best thing,
                                    however. Other approaches are possible. This list may well have appeared in a
                                    book or journal somewhere. If so, that could always be cited, and the citation
                                    could be updated by updating RFC1766 as necessary.

                                    > I'd like to improve 1766 to make it more comprehensive. You couldn't
                                    > help it that 639 is so limited. So I'm really not faulting 1766. It
                                    > seems it is important to recognize its limitations based on 639's limits
                                    > and move on from there. Given that the SIL list is available and a new
                                    > list from ISO is not forthcoming (it's been stuck as a CD for over 5
                                    > years now with little activity), I think that we should move ahead and
                                    > use the SIL list.

                                    > I would be pleased to assist in improving 1766 so it can be a single,
                                    > comprehensive language tag standard. Something I also want.

                                    Good. Then let's get going on it in the MAILEXT WG.

                                    Ned
                                  • Olle Jarnefors
                                    Message 17 of 27 , Nov 3, 1995
                                      Glenn Adams <glenn@...> wrote in message
                                      <9511021047.AA02653@...>:

                                      > From: Harald.T.Alvestrand@...
                                      > Date: Thu, 02 Nov 1995 11:33:53 +0100
                                      >
                                      > If you wish to register I-SIL-nnn as a standard for three-letter
                                      > Ethnologue-based tags, or even want to push for updating RFC 1766 to
                                      > include S-nnn as a new category, I would not argue against that, but
                                      > I would like to stick to the principle of using ISO standards for
                                      > the basic namespace.
                                      >
                                      > Since IANA does not otherwise use such a principle I don't know why you
                                      > would adopt it here, let alone insist on it.

                                      One thing that I don't understand is why you insist on 3-letter
                                      tags nnn instead of 7-letter tags I-S-nnn for Ethnologue-based
                                      language labels. How can 4 bytes per tag make such a difference?
                                      _No_ changes of RFC 1766 at all are needed to register tags of
                                      the form I-S-nnn.
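
The syntax claim can be checked with a minimal Python sketch of the RFC 1766 tag grammar: a primary tag plus optional subtags, each 1 to 8 letters, with "i" reserved for IANA registrations and "x" for private use. The function name `classify_tag` and the category labels are illustrative assumptions, and `nnn` is the thread's own placeholder, not a real Ethnologue code.

```python
import re

# RFC 1766 syntax: primary tag, then zero or more subtags,
# each made of 1 to 8 ASCII letters and separated by hyphens.
TAG_RE = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def classify_tag(tag: str) -> str:
    """Return a rough category for a tag (labels are illustrative only)."""
    if not TAG_RE.fullmatch(tag):
        return "invalid"
    primary = tag.split("-")[0].lower()
    if primary == "i":
        return "iana-registered"   # reserved for IANA-defined registrations
    if primary == "x":
        return "private-use"       # reserved for private use
    if len(primary) == 2:
        return "iso-639"           # two-letter primary tags follow ISO 639
    # Anything else cannot be assigned without updating the RFC,
    # which is the case for a bare three-letter primary tag.
    return "unassigned"


if __name__ == "__main__":
    for example in ("ru", "i-s-nnn", "nnn"):
        print(example, "->", classify_tag(example))
```

Under this reading, a tag of the form `i-s-nnn` already falls in the IANA-registered space, while a bare `nnn` is syntactically legal but has no assigned meaning in RFC 1766 as it stands.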

                                      > Otherwise, we will end up with a really confusing situation once the
                                      > ISO 3166 three-letter project finishes (if it ever does); its tags are
                                      > SURE to conflict with the SIL tag.
                                      >
                                      > Even more incongruent does your "principle" appear given the laggardness
                                      > of this particular ISO work item.

                                      You will not have to wait for ISO to get its act together, if
                                      I-S-nnn tags are registered. _If_ a 3-letter extension of
                                      ISO 639 is eventually adopted, the 3-letter language tag space
                                      should still be available, should IETF find these codes useful.

                                      > It is extremely unlikely that ISO or
                                      > anyone else for that matter will do as comprehensive a job as SIL has done
                                      > in creating their language database.

                                      A possible future 3-letter ISO standard will probably have other
                                      desirable properties:
                                      + Wide acceptance in the bibliographic and linguistic
                                      communities that pioneered standardized language tagging.
                                      + Extensive experience of a very similar language tagging
                                      system (*).
                                      + Interoperability with the internationalization work in
                                      ISO/JTC1/SC22 (Posix etc.).
                                      + Low frequency of errors and ambiguity thanks to careful
                                      scrutiny by linguistic experts.
                                      + An open and international standardization process, minimizing
                                      the risk for accusations of political bias and other
                                      neonationalistic fuzz (**).
                                      + The automatic authority possessed by ISO-stamped standards in
                                      culturally and nationally sensitive fields.

                                      (*) The proposed 3-letter extension of ISO 639 was essentially
                                      the Library of Congress language tagging system, with additions
                                      of alternative codes for some languages, thought to be more
                                      considerate and acceptable to the users of those languages.

                                      (**) I'm talking about _real_ politics here, not the toy
                                      "politics" of the fights between different networking standards.

                                      But the important thing is that we don't have to choose between
                                      these two systems, or try to predict the outcome and time frames
                                      of the ISO work: If a new ISO standard does come into existence,
                                      it can be added to the already registered tags, including
                                      I-S-nnn tags defined according to the Ethnologue project.

                                      /Olle

                                      --
                                      Olle Jarnefors, Royal Institute of Technology, Stockholm <ojarnef@...>
                                    • Olle Jarnefors
                                      Message 18 of 27 , Nov 3, 1995
                                        Ned Freed <NED@...> wrote in message <01HX6C9585U29BVL0H@...>:

                                        > If there's even the slightest chance that multiple three letter code schemes
                                        > will exist in the future then the S-nnn approach seems like the way to go to
                                        > me.

                                        The I-S-nnn approach is fully consistent with the current RFC
                                        1766. Is a revision of the RFC justified, only to save 2 bytes?

                                        > In summary, the only thing here that "sounds empty" to me is the notion that
                                        > it's acceptable to have two different sets of language tags in different IETF
                                        > work items and that it's acceptable to avoid revising documents that need revision.
                                        > Glenn may have an excellent case for putting the SIL codes in RFC1766. He may
                                        > even have a case for putting the SIL codes in without an "S-" introducer and
                                        > putting the introducer on the son-of-639 codes should they ever appear. If so,
                                        > he needs to bring this up with the WG that produced the content-language
                                        > specification -- the MAILEXT WG, I believe.

                                        Does the MAILEXT WG still exist? The latest minutes seem to be
                                        mailext-minutes-95apr.txt, which says:

                                        : The working group should conclude its work by the Stockholm IETF. (Note:
                                        : the Area Directors would like to see the group conclude its work prior
                                        : to the IETF.)

                                        How can one in general find out which IETF WGs are not yet
                                        disbanded? Is there any always up-to-date database covering WGs?
                                        Is dissolution of WGs always announced on the ietf-announce
                                        list, with some unique substring in the Subject header?

                                        /Olle

                                        --
                                        Olle Jarnefors, Royal Institute of Technology, Stockholm <ojarnef@...>
                                      • Olle Jarnefors
                                        Message 19 of 27 , Nov 3, 1995
                                          Glenn Adams <glenn@...> wrote in message
                                          <9511031019.AA02999@...>:

                                          > I'd like to improve 1766 to make it more comprehensive. You couldn't
                                          > help it that 639 is so limited. So I'm really not faulting 1766. It
                                          > seems it is important to recognize its limitations based on 639's limits
                                          > and move on from there.

                                          The way to do that is to use the registration procedure for
                                          language tags defined in RFC 1766. This RFC was never meant to
                                          define a comprehensive set of language tags itself.

                                          > I would be pleased to assist in improving 1766 so it can be a single,
                                          > comprehensive language tag standard. Something I also want.

                                          The IANA language tag registry may become that comprehensive
                                          standard. No revision of 1766 is needed to achieve that.

                                          /Olle

                                          --
                                          Olle Jarnefors, Royal Institute of Technology, Stockholm <ojarnef@...>
                                        • Ned Freed
                                          Message 20 of 27 , Nov 3, 1995
                                            > > If there's even the slightest chance that multiple three letter code schemes
                                            > > will exist in the future then the S-nnn approach seems like the way to go to
                                            > > me.

                                            > The I-S-nnn approach is fully consistent with the current RFC
                                            > 1766. Is a revision of the RFC justified, only to save 2 bytes?

                                            It isn't clear to me that this is actually allowed under the rules RFC1766 sets
                                            up for IANA registrations. There is nothing in there that seems to allow for
                                            registration of an entire block to an agency outside of IANA. And even if there
                                            were, the registration would still have to go through the registration process,
                                            at which time exactly the same set of issues are going to have to be evaluated
                                            and discussed. You gain nothing here except the time it takes to revise an RFC.

                                            And what's more, you lose a lot more than two bytes. You lose completeness
                                            of specification. A registration of this sort doesn't have to appear in an
                                            RFC, so when someone wonders what I-S-drofnats or whatever is they will have
                                            to:

                                            (1) Dig up the current RFC. Find the pointer in it to the current registration
                                            policy.
                                            (2) Dig up the registration policy. Find the pointer in it to the registered
                                            names table.
                                            (3) Look through the table and find the pointer to the allocated subspace.
                                            (4) Find the entry in the allocated subspace.

                                            This seems like a lot more work than it should be. Why not simply have an RFC
                                            that contains a pointer to the definitive table? It's not like all the
                                            indirection buys you anything.
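
The four-step lookup described above, and the shorter alternative of an RFC that points straight at the table, can be compared in a small Python sketch. Every document name, pointer, and entry here is a hypothetical stand-in (including the placeholder tag `i-s-nnn`); nothing reflects an actual IANA registry layout.

```python
from typing import List, Tuple

# Hypothetical documents; each maps a key to the next thing to look up.
DOCUMENTS = {
    "rfc": {"pointer": "registration-policy"},
    "registration-policy": {"pointer": "registered-names-table"},
    "registered-names-table": {"i-s": "allocated-subspace"},
    "allocated-subspace": {"i-s-nnn": "placeholder entry for tag nnn"},
}


def resolve_indirect(tag: str) -> Tuple[str, int]:
    """Follow steps (1)-(4); return the entry and the number of documents read."""
    tag = tag.lower()
    consulted: List[str] = ["rfc"]                     # (1) the current RFC
    consulted.append(DOCUMENTS["rfc"]["pointer"])      # (2) registration policy
    consulted.append(DOCUMENTS[consulted[-1]]["pointer"])  # (3) registered names table
    block = "-".join(tag.split("-")[:2])               # e.g. "i-s"
    consulted.append(DOCUMENTS[consulted[-1]][block])  # (4) allocated subspace
    return DOCUMENTS[consulted[-1]][tag], len(consulted)


def resolve_direct(tag: str) -> Tuple[str, int]:
    """Alternative: an RFC whose pointer leads directly to the definitive table."""
    consulted = ["rfc", "allocated-subspace"]
    return DOCUMENTS[consulted[-1]][tag.lower()], len(consulted)


if __name__ == "__main__":
    print(resolve_indirect("I-S-nnn"))
    print(resolve_direct("I-S-nnn"))
```

Both paths reach the same entry; the sketch only makes the difference in the number of documents consulted visible.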

                                            > > In summary, the only thing here that "sounds empty" to me is the notion that
                                            > > it's acceptable to have two different sets of language tags in different IETF
                                            > > work items and that it's acceptable to avoid revising documents that need revision.
                                            > > Glenn may have an excellent case for putting the SIL codes in RFC1766. He may
                                            > > even have a case for putting the SIL codes in without an "S-" introducer and
                                            > > putting the introducer on the son-of-639 codes should they ever appear. If so,
                                            > > he needs to bring this up with the WG that produced the content-language
                                            > > specification -- the MAILEXT WG, I believe.

                                            > Does the MAILEXT WG still exist? The latest minutes seem to be
                                            > mailext-minutes-95apr.txt, which says:

                                            > : The working group should conclude its work by the Stockholm IETF. (Note:
                                            > : the Area Directors would like to see the group conclude its work prior
                                            > : to the IETF.)

                                            Working groups exist by definition until the standards-track documents they are
                                            working on reach the status of Standard or are removed from the standards
                                            track. Groups "conclude their work" all the time, only to be revived to handle
                                            the transition from proposed to draft or draft to standard.

                                            It's also possible for documents to move along without WG reactivation. However,
                                            in the case of a proposal to make a substantive change to a specification (and
                                            I don't see how you can view this in any other way), it seems inevitable that
                                            the group will have to reactivate to consider it.

                                            > How can one in general find out which IETF WGs are not yet
                                            > disbanded? Is there any always up-to-date database covering WGs?
                                            > Is dissolution of WGs always announced on the ietf-announce
                                            > list, with some unique substring in the Subject header?

                                            As far as I know the concept of "disbanding" a group doesn't exist. The mailing
                                            list always continues to operate. Most groups eventually reach a state of
                                            permanent dormancy that they never escape from, and are eventually rendered
                                            completely irrelevant by other work items that replace the specifications they
                                            originally produced.

                                            ned
                                          • Larry Masinter
                                            Message 21 of 27 , Nov 3, 1995
                                              Date: Mon, 10 Jul 95 14:01:24 EST
                                              To: http-wg%cuckoo.hpl.hp.com@...
                                              In-reply-to: Alex Hopmann's message of Mon, 10 Jul 1995 13:45:11 -0700 <199507102045.NAA08852@...>
                                              Subject: Re: HTTP Session Extension draft
                                              From: Larry Masinter <masinter@...>

                                              It might improve net efficiency (and possibly allow servers to
                                              precompute information or ignore these headers if they don't care) to
                                              package together those things that are configuration specific (accept,
                                              accept-encoding, accept-charset, accept-language and user-agent:) and
                                              send them by reference, e.g.,

                                              the client sends:

                                              accept-hash: NNNNNNNNNNNNNNN

                                              where NNNNNNNNNNNNNNN is the MD5 of the omitted headers; the server
                                              sends back an error return if it actually needs the fields.
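
A rough Python sketch of this send-by-reference idea follows. The message does not define how the omitted headers are serialized before hashing, so the canonical form used here (fixed header order, lower-case names, CRLF line endings) is an assumption, as are the function names `fold_config_headers` and `server_lookup`.

```python
import hashlib
from typing import Dict, Optional

# The configuration-specific headers named in the proposal.
CONFIG_HEADERS = (
    "accept",
    "accept-encoding",
    "accept-charset",
    "accept-language",
    "user-agent",
)


def fold_config_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Replace the configuration headers with one Accept-Hash header."""
    lowered = {name.lower(): value for name, value in headers.items()}
    # Assumed canonical form: fixed order, lower-case names, CRLF endings.
    canonical = "".join(
        f"{name}: {lowered[name]}\r\n" for name in CONFIG_HEADERS if name in lowered
    )
    digest = hashlib.md5(canonical.encode("latin-1")).hexdigest()
    folded = {n: v for n, v in headers.items() if n.lower() not in CONFIG_HEADERS}
    folded["Accept-Hash"] = digest
    return folded


def server_lookup(known: Dict[str, Dict[str, str]], digest: str) -> Optional[Dict[str, str]]:
    """Return headers previously seen for this digest, or None.

    None stands in for the "error return" the server would send
    when it actually needs the omitted fields.
    """
    return known.get(digest)


if __name__ == "__main__":
    request = {"Accept": "text/html", "Accept-Language": "ru, en", "User-Agent": "Lynx"}
    print(fold_config_headers(request))
```

A client would fall back to sending the full headers once when the lookup fails, after which the server could answer later requests from the digest alone.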

                                              This would be useful independent of whether the connection remains
                                              open: even if the connection closes, the information might affect a
                                              cache choice; even if the connection remains open, a proxy might want
                                              to send different header information when proxying for different
                                              clients.

                                              (Clearly this would be in 1.1; if HTTP were recast as ILU or CORBA, it
                                              would be done as a client object that the server could interrogate.)
                                            • Glenn Adams
                                              Message 22 of 27 , Nov 9, 1995
                                                Date: Fri, 03 Nov 1995 09:36:58 -0800 (PST)
                                                From: Ned Freed <NED@...>

                                                You've provided the list in the form of a URL that appears to be
                                                attached to your site.

                                                I wasn't making an effort to be formally complete. So you shouldn't assume
                                                that I cannot provide a more stable reference. See:

                                                Ethnologue, Languages of the World, Twelfth Edition, Barbara F. Grimes,
                                                Editor, 1992, SIL, Dallas TX. ISBN 0-88312-815-2.

                                                Ethnologue Index, Twelfth Edition, Barbara F. Grimes, Editor, 1992, SIL,
                                                Dallas TX. ISBN 0-88312-819-5.

                                                I'd be happy to move this discussion over to the MAILEXT WG, though I'm
                                                not entirely convinced that is the pertinent forum for it to occur. At
                                                least in the sense that the issue of language tags extends to all application
                                                areas, not simply mail.

                                                Regards,
                                                Glenn Adams
                                              • Glenn Adams
                                                Message 23 of 27 , Nov 9, 1995
                                                  Date: Fri, 3 Nov 95 21:02:21 +0100
                                                  From: Olle Jarnefors <ojarnef@...>

                                                  One thing that I don't understand is why you insist on 3-letter
                                                  tags nnn instead of 7-letter tags I-S-nnn for Ethnologue-based
                                                  language labels.

                                                  I would accept the I-S-nnn form, and would encourage the folks working
                                                  on the new ISO 3 char standard to accept as an input document the existing
                                                  Ethnologue list.

                                                  Regards,
                                                  Glenn
                                                • Ned Freed
                                                  Message 24 of 27 , Nov 9, 1995
                                                    > I'd be happy to move this discussion over to the MAILEXT WG, though I'm
                                                    > not entirely convinced that is the pertinent forum for it to occur. At
                                                    > least in the sense that the issue of language tags extends to all application
                                                    > areas, not simply mail.

                                                    The language tag document came out of that group. I believe that means that
                                                    discussion of revisions to that document has to occur there unless you can
                                                    convince the area directors that a change of venue is appropriate. You're
                                                    welcome to try that if you like.

                                                    Ned