Loading ...
Sorry, an error occurred while loading the content.

[agileDatabases] RE: Bit or Boolean has no place in SQL database (was: Changing wide unnormalised table to tall thin tables - performance ? - HELP)

Expand Messages
  • Richard Quinn
    Help! Mr. Celko has well and truly mauled me! Pounced and ripped me to shreds. And I thought I was so clever! Anyway: It is smart to argue with a genius;
    Message 1 of 25 , Aug 13, 2003
    • 0 Attachment
      Help!

      Mr. Celko has well and truly mauled me! Pounced and ripped me to shreds.
      And I thought I was so clever! Anyway: "It is smart to argue with a
      genius; listeners can't tell which is which." :))

      But it is true, I didn't know that ISO had standardized GENDER (maybe
      someday they *will* get around to standardizing sex, there is certainly
      too much variety around :) ). Can't wait to hear the rant about DBAs who
      fail to use industry standard codes :)

      But to answer your questions, I don't hate 1FN! I love it! Nearly as
      much as I love 2NF! Which is nearly as loveable as 3NF! And so on... But
      let's see:

      ObjectID SubjectID Permission (CHAR)
      ------ ------- ---------
      2 103 read
      2 103 write
      2 103 exec
      ... ... . . .

      Compared with
      ObjectID SubjectID Permission(Byte)
      ------ ------- ---------
      2 103 7

      I agree, the 2nd example is not in 1NF (it is in "hand-optimized 0NF"
      lol) so building sets and ranges may become quite difficult. But (and as
      long as I'm running on x86 architctures it remains true) I can read
      every state of the subject <has permission on> object relationship in
      one fell swoop on the heap or the index, obviating the need for most
      sets. Same goes for deleting all values or for updating the state. In
      the systems that I've helped push off the cliff, with millions of
      objects having one of many possible states on hundreds of subjects, we
      have been able to realize *serious* reduction in concurrency contention
      and big performance gains. We did think about it, we even argued a bit
      too, but the need for speed plus a willingness to sacrifice some of
      Codd's principles helped us to quickly make a very fast system. (And I
      have to point out, we actually do do quite a bit of normalization, too
      much for some people on this list perhaps).

      BTW, there are no !illegal! Combinations here, that would be quite a
      problem with a byte datatype. Or any other metadata describing the
      acceptable state of an attribute :))

      So, in summing up, it may not be normalized, but it sure as hell is fast
      whilst remaining thought out. And by the time my customers get around to
      wanting to count the number of people with permission x or y and z on an
      action, they've already changed the data so often the question looses
      meaning. Its just my evolutionary response to an ever more aggressive
      predator...

      - Richard







      >>
      >> A classic example: I have a field in my PERSON table called
      IsMale (BIT). When set to true it informs me that the person is a
      male. <<

      This is a clasic example, but not how you think. What is the
      **attribute** we are measuring? Sex (or gender)! What is the
      domain of values for this atribute? The ISO sex codes, of course:

      0= Unknown
      1= Male
      1= Female
      1= N/A (lawful persons, such as corporations)

      Now, anyone who uses ISO Standard can share my data. No experienced
      DBA would fail to use industry standard codes, while newbies today
      are so @&$%! lazy or stupid, they never think of looking for one.
      But that is another rant of mine.

      >> The attribute can not possibly be logically or physically in any
      other state, but the information stored is just as valid as that
      stored in any other field [sic]. <<

      That is a major problem -- the domain for a BIT is not the domain
      for sex. The name "Ismale" tells you this is *not* an attribute,
      but a predicate, a shorthand for (foobar.sex = 1) in ISO codes,
      which is the state in the database of a column within a row.

      >> I see that the third state of "UNKNOWN" is not quite the same
      state as "NULL", but who cares? <<

      ISO cares. Anyone who uses ISO standards cares.

      This is very subtle, like the difference betweeen a number and a
      numeral, or predicate calculus and first-order logic. Quantifiers
      are at a different logical level than simple predicates.

      >> On another note, and I can tell you're gooing to hate this one :)
      I occasionally use BYTE fields, especially when storing things like
      permissions in an ACL kind of table. 0=no rights, 1=read, 2=write,
      4=execute, 8=delete, 16=take ownership. <<

      I don't mind -- I made a six-digit income fixing this assembly
      language coding :). Wait until you move from a high-end to a low-
      end machine or change word size!! Why do you hate first normal form
      so much that you drop back to 1950's machine code and all the
      complexity in the application code? Where are your constraints to
      prevent illegal combinations? Etc.

      You can get away with numeric codes unless you try to actually pass
      it as a BYTE without conversion to the appropriate host langauge
      datatype.

      >> So if the value in there is a 12 it is easy to compute what
      permissions the subject has on an object (exec and del). It means
      that I can store - in a single row - much more densely packed info.
      I think that this fits in quite nicely with some Agile principles.<<

      The "code first, think later, normalize never!" principle? I see a
      lot of that with agile programming; we used to call it a kludge
      or "Wright Brothers system engineering" (put it all together, push
      it off a cliff, see if it flies and try again ).

      <<
    • Bayley, Alistair
      ... Sorry. Just read the SQL Server docs. Bit is a bit special: you can t store nulls (and it s not indexable). AFAICT it s the only SQL Server datatype that
      Message 2 of 25 , Aug 14, 2003
      • 0 Attachment
        > From: Joe Celko [mailto:celko@...]
        >
        > 2) A BIT must be either '1' or '0' and absolutely, positively,
        > never, never anything else. These guys live in physical hardware
        > and have to obey this law.
        >
        > You can argue for breaking one of these "laws of the Universe" or
        > you can simply say that BIT cannot belong in SQL because it is a low
        > level, physical implementation concept.

        Sorry. Just read the SQL Server docs. Bit is a bit special: you can't store
        nulls (and it's not indexable). AFAICT it's the only SQL Server datatype
        that doesn't allow nulls. How odd...


        > From: Richard Quinn [mailto:rquinn@...]
        >
        > But it is true, I didn't know that ISO had standardized GENDER (maybe
        > someday they *will* get around to standardizing sex, there is
        > certainly too much variety around :) ).

        If you're recording data about whether a person is Male, Female, or
        Something Else, then it's probably more correct to call it Sex. See:
        http://www.iath.virginia.edu/~umw8f/Barbarians/Essays/gender.html

        Sex: "Biological sex is obvious -- a person is either male or female." <--
        hermaphrodites?

        Gender: "Gender is a cultural construct, part of the way an individual
        defines him/herself and is defined by others in the society. Gender is
        connected only in part to biological sex and issues of sexuality and
        reproduction; to a great extent, we study gender in the context of economic
        and power relations, of production and class, of ritual, belief and
        ideology."

        See also:
        http://www.bartleby.com/64/C005/010.html

        Gender was a linguistic term, but has been borrowed by fields outside of
        linguistics to refer to sex-based categories. It's meaning has become
        ambiguous.


        *****************************************************************
        The information in this email and in any attachments is
        confidential and intended solely for the attention and use
        of the named addressee(s). This information may be
        subject to legal professional or other privilege or may
        otherwise be protected by work product immunity or other
        legal rules. It must not be disclosed to any person without
        our authority.

        If you are not the intended recipient, or a person
        responsible for delivering it to the intended recipient, you
        are not authorised to and must not disclose, copy,
        distribute, or retain this message or any part of it.
        *****************************************************************
      • Scott W. Ambler
        Seems to me you might want to consider two fields: 1. Boolean: Indicator whether or not you know the value. 2. Char(1): M=Male, F=Female, A=Androgynous (that s
        Message 3 of 25 , Aug 14, 2003
        • 0 Attachment
          Seems to me you might want to consider two fields:

          1. Boolean: Indicator whether or not you know the value.
          2. Char(1): M=Male, F=Female, A=Androgynous (that's a valid value in some
          domains), NULL if unknown.

          Or you could just go with one field, the second, and add U for unknown as
          the previous poster suggested.

          I'm a firm believer in following standards, see
          www.agilemodeling.com/practices.htm#ApplyModelingStandards, when they're
          applicable. I'm not sure the ISO standard has it covered, but that would
          depend on the actual requirements that you're trying to fulfill, wouldn't it?

          - Scott

          At 10:02 AM 8/13/2003 +0300, you wrote:
          >Hello Richard,
          >
          > >A classic example: I have a field in my PERSON table callied IsMale
          > >(BIT). When set to true it informs me that the person is a male. If NULL
          > >I can't say. False == female. The attribute can not possibly be
          >
          >In my opinion it's more correct for your field to be called SEX.
          >So you will have values: Male, Female, Unknown, NULL which is perfectly
          >correct to be coded using numeric values (except NULL ofcourse).
          >
          >--
          >Best regards,
          > Alexander mailto:saho@...

          ====================================================
          Scott W. Ambler
          Senior Consultant, Ronin International, Inc.
          www.ronin-intl.com/company/scottAmbler.html

          www.agiledata.org
          www.agilemodeling.com
          www.ambysoft.com
          www.enterpriseunifiedprocess.info
          www.modelingstyle.info
          www.ronin-intl.com
        • Alex Weatherall
          ... can t store ... datatype ... Err... according to Books Online, the BIT type is NULLABLE (I tried this and you can set a bit column to NULL)
          Message 4 of 25 , Aug 15, 2003
          • 0 Attachment
            > Sorry. Just read the SQL Server docs. Bit is a bit special: you
            can't store
            > nulls (and it's not indexable). AFAICT it's the only SQL Server
            datatype
            > that doesn't allow nulls. How odd...

            Err... according to Books Online, the BIT type is NULLABLE (I tried
            this and you can set a bit column to NULL)
            <MS SQLSERVER Books Online>
            bit
            Integer data type 1, 0, or NULL.

            Remarks
            Columns of type bit cannot have indexes on them.

            Microsoft® SQL Server™ optimizes the storage used for bit columns. If
            there are 8 or fewer bit columns in a table, the columns are stored
            as 1 byte. If there are from 9 through 16 bit columns, they are
            stored as 2 bytes, and so on.
            </MS SQLSERVER Books Online>

            Obviously setting an index on this column wouldn't be a good idea as
            there are only two (not including null) values to index. That would
            be like writing a book in binary and then having an index at the back
            with all the pages with 1's in and all the pages with 0's in. The
            index would be larger than the data!
          • Bayley, Alistair
            ... Oh, I see. I looked at the 6.5 docs first. The docs for SQL Server 2000 state that nulls are allowed. They must have changed it sometime after 6.5.
            Message 5 of 25 , Aug 15, 2003
            • 0 Attachment
              > From: Alex Weatherall [mailto:alexweatherall@...]
              >
              > Err... according to Books Online, the BIT type is NULLABLE (I tried
              > this and you can set a bit column to NULL)
              > <MS SQLSERVER Books Online>
              > bit
              > Integer data type 1, 0, or NULL.


              Oh, I see. I looked at the 6.5 docs first. The docs for SQL Server 2000
              state that nulls are allowed. They must have changed it sometime after 6.5.


              *****************************************************************
              The information in this email and in any attachments is
              confidential and intended solely for the attention and use
              of the named addressee(s). This information may be
              subject to legal professional or other privilege or may
              otherwise be protected by work product immunity or other
              legal rules. It must not be disclosed to any person without
              our authority.

              If you are not the intended recipient, or a person
              responsible for delivering it to the intended recipient, you
              are not authorised to and must not disclose, copy,
              distribute, or retain this message or any part of it.
              *****************************************************************
            • Joe Celko
              ... sometime after 6.5.
              Message 6 of 25 , Aug 15, 2003
              • 0 Attachment
                >> 6.5 docs ... SQL Server 2000 .. They must have changed it
                sometime after 6.5. <<

                Yet another problem with using proprietary datatypes and working at
                the machine hardware level; the vendor gets to make up the rules as
                they go along and you cannot appeal to ANSI or ISO. I sure hope
                that you like an Intel architecture, because you can be locked into
                it for awhile.
              • Eytan Ben-Meir
                Hi, This is a general question. I m interested in peoples experience & insight. Have you encountered a situation in your practice where: 1) Amount of data was
                Message 7 of 25 , Aug 15, 2003
                • 0 Attachment
                  Hi,
                   
                  This is a general question.
                  I'm interested in peoples experience & insight.
                  Have you encountered a situation in your practice where:
                  1) Amount of data was very large (much larger - at least one order of magnitude - than available RAM).
                  2) Requirements included flexibile schema
                       - Need to add(/remove) attributes to entities after deployment
                       - Need to add(/remove)  relationships between existing entities
                       - Need to add(/remove) new entities with relationships to existing (old) entities
                  ???
                  3) If so, what are the patterns for coping?
                   
                  Thanks,
                   
                  Eytan Ben-Meir
                   
                     
                  -----Original Message-----
                  From: Joe Celko [mailto:celko@...]
                  Sent: Fri, August 15, 2003 8:12 AM
                  To: agileDatabases@yahoogroups.com
                  Subject: [agileDatabases] Re: Changing wide unnormalised table to tall thin tables - performance ? - HELP

                  >> 6.5 docs ... SQL Server 2000 .. They must have changed it
                  sometime after 6.5. <<

                  Yet another problem with using proprietary datatypes and working at
                  the machine hardware level; the vendor gets to make up the rules as
                  they go along and you cannot appeal to ANSI or ISO.  I sure hope
                  that you like an Intel architecture, because you can be locked into
                  it for awhile. 





                  To unsubscribe from this group, send an email to:
                  agileDatabases-unsubscribe@yahoogroups.com



                  Your use of Yahoo! Groups is subject to the Yahoo! Terms of Service.
                • Simon Larsen
                  Actually I think that was mssql 7 and earlier. 2k does allow nulls (don t remember how I know this but it has a memory of bad associated with it, in that I
                  Message 8 of 25 , Aug 17, 2003
                  • 0 Attachment
                    Actually I think that was mssql 7 and earlier. 2k does allow nulls
                    (don't remember how I know this but it has a memory of "bad" associated
                    with it, in that I think I got myself in some difficulty using them that
                    way when I first came to 2K).

                    From BOL: "If the setting of sp_dbcmptlevel is 65 or lower, bit data
                    types default to NOT NULL if the column does not have an explicit NULL
                    or NOT NULL."

                    Although you are correct that it is not indexable, although why anyone
                    would do that is beyond me, even with nullable you only have 3 possible
                    values.

                    sl


                    From: Bayley, Alistair [mailto:alistair_bayley@...]

                    Sorry. Just read the SQL Server docs. Bit is a bit special: you can't
                    store nulls (and it's not indexable). AFAICT it's the only SQL Server
                    datatype that doesn't allow nulls. How odd...
                  Your message has been successfully submitted and would be delivered to recipients shortly.