
Re: [solarisx86] SATA Hardware Raid - recommended cards

  • Henk Langeveld
    Message 1 of 24, Dec 1, 2006
      Hi Chad,

      me:
      >> The problem does not need to be in the RAID s/w. It could be
      >> something that the RAID firmware cannot see, assuming it is perfect.

      Chad Leigh:
      > The same issues exist for JBOD controllers and HD in general.

      >> * ZFS does more than check disk read/write errors. h/w read cannot do
      >> more than that.

      > Any checks that ZFS makes could also be done in HW/FW. FW is just SW
      > burned into a ROM/flash memory or similar. HW can implement the same
      > algorithms as SW.

      Correct. Up to a certain level.

      >> * Solaris engineering has some pretty good development processes which
      >> are themselves well documented and open for review. What do you know of
      >> the h/w raid development? Which one?

      > Long-established RAID development companies probably have similar
      > development processes to what Sun uses. I am not talking about a
      > guy in a garage. (And not specifically about any particular RAID.)

      (Some of them even use Sun in their dev labs, I've been told.)

      > I also worked at DEC and even good development processes do not
      > eliminate bugs and problems. We see on the ZFS list all the issues
      > and bugs in ZFS and the hard work to overcome them. I am not
      > knocking the Solaris / ZFS team. I wouldn't be here if I was trying
      > to knock them -- I would just go elsewhere. I am saying that ZFS is
      > not a panacea and the criticisms of HW raid solutions could just as
      > easily apply to Solaris or drivers or ZFS itself and to the commodity
      > HW driving the JBOD arrays.

      Granted.

      So the only remaining essential difference between ZFS and other systems
      is that it acts as the sole interface between application and physical
      storage, and performs its own end-to-end integrity check across that
      whole path.

      ZFS is, to my knowledge, the first Unix file system to implement
      something similar to TCP/IP's end-to-end checking.

      In UFS, we hand off individual blocks and trust that what we receive is
      what was sent out in the first place, relying on the sub-system.

      In systems programming, we are always making assumptions. Interfaces are defined
      and define a contract. Layer A tells layer B to do X, layer B responds with
      "Yes, done" or "Nope, couldn't do that, because ...". Layer A has to take
      that response at face value. If layer B says "Done", you have to assume it did.
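
      To make that concrete, here is a toy sketch in C of that contract (all
      names are made up; this is not code from any real driver stack):

      #include <stddef.h>

      /* Hypothetical layer B: a block layer that answers "done" --
       * whether or not the data actually survived the trip down. */
      static int block_write(const void *buf, size_t len)
      {
          (void)buf;
          (void)len;
          return 0;                     /* "Yes, done" */
      }

      /* Layer A: the file system.  The status code is all it gets,
       * so it has to take "done" at face value. */
      static int fs_write(const void *buf, size_t len)
      {
          return block_write(buf, len); /* trust, no verification */
      }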

      The file system is the highest layer in the system between an application
      and its data. It is the system's last opportunity to perform any
      validation of that data. I know that ZFS performs that end-to-end check.

      Storage vendors may do similar things, but their hand-off is by definition
      at a lower layer of abstraction. Do the bigger SAN vendor(s) implement
      end-to-end control at the block level in their drivers, catching possible
      HBA/DMA issues at the host level?

      NAS vendors have the mixed blessing of a known file system interface, providing
      a higher level hand off, but limited to the features of the particular network
      file system interface, which may not be much (CIFS, NFS).

      ZFS and TCP have in common that they take end-to-end responsibility, and do not
      rely on a chain of subsystems each performing their own (valid) checks, but add
      an overall check between A and Z, protecting against subtly broken memory, ports,
      drivers, etc. You may know that you have top-quality equipment, but the file
      system does not.
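
      A minimal sketch in C of that end-to-end idea (my own toy code, not
      ZFS's: real ZFS keeps the checksum in the parent block pointer rather
      than next to the data, which also catches phantom and misdirected
      writes, and uses fletcher2/fletcher4 or SHA-256 rather than this toy
      sum):

      #include <stddef.h>
      #include <stdint.h>
      #include <string.h>

      /* Toy additive checksum, standing in for fletcher4/SHA-256. */
      static uint64_t cksum(const uint8_t *buf, size_t len)
      {
          uint64_t a = 0, b = 0;
          for (size_t i = 0; i < len; i++) {
              a += buf[i];
              b += a;
          }
          return (b << 32) ^ a;
      }

      struct block {
          uint8_t  data[512];
          uint64_t sum;        /* kept next to the data for brevity */
      };

      /* End "A": checksum computed at the top layer, before the
       * hand-off down through driver, HBA and firmware. */
      static void fs_write(struct block *blk, const uint8_t *buf)
      {
          memcpy(blk->data, buf, sizeof blk->data);
          blk->sum = cksum(blk->data, sizeof blk->data);
      }

      /* End "Z": checksum verified at the top layer after the data
       * has come back up the whole chain.  Silent corruption in any
       * layer in between shows up as a mismatch here. */
      static int fs_read(const struct block *blk, uint8_t *buf)
      {
          if (cksum(blk->data, sizeof blk->data) != blk->sum)
              return -1;       /* broken somewhere between A and Z */
          memcpy(buf, blk->data, sizeof blk->data);
          return 0;
      }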

      Cheers,
      Henk
    • Chad Leigh -- Shire.Net LLC
      Message 2 of 24, Dec 1, 2006
        On Dec 1, 2006, at 2:43 AM, Henk Langeveld wrote:

        > ZFS and TCP have in common that they take end-to-end responsibility,
        > and do not rely on a chain of subsystems each performing their own
        > (valid) checks, but add an overall check between A and Z, protecting
        > against subtly broken memory, ports, drivers, etc. You may know that
        > you have top-quality equipment, but the file system does not.

        Yes, but the argument was not against ZFS; it was against the claim
        that ZFS proves the RAID is screwing up, when it does not prove this.
        The bug could be a SW bug in the kernel, or a driver, or even in ZFS
        itself. I was complaining about the idea that all problems are
        automatically in the RAID system when ZFS has an issue, and not
        somewhere else. That is a false accusation unless you can show proof.

        ZFS has the same problem in a JBOD system with a faulty driver as it
        does with a RAID system with a faulty driver. And ZFS may expose a
        problem that exists somewhere (which could be in ZFS or in the kernel
        or a driver or in the HW or RAID itself) but the problem is not
        pinpointed by ZFS.

        Chad

        ---
        Chad Leigh -- Shire.Net LLC
        Your Web App and Email hosting provider
        chad at shire.net





      • Henk Langeveld
        Message 3 of 24, Dec 1, 2006
          Chad Leigh -- Shire.Net LLC wrote:
          > ZFS has the same problem in a JBOD system with a faulty driver as it
          > does with a RAID system with a faulty driver. And ZFS may expose a
          > problem that exists somewhere (which could be in ZFS or in the kernel
          > or a driver or in the HW or RAID itself) but the problem is not
          > pinpointed by ZFS.

          Now I see your point. Indeed, ZFS can highlight trouble in some other
          layer, but it cannot tell you where.

          What it is especially good at is identifying subtle errors where each
          subsystem correctly but blindly does as it is told.

          Troubleshooting still requires the traditional process of elimination.

          I remember being called out once to troubleshoot an unpredictable
          data corruption issue - the customer was not even able to reproduce
          it. Months later we found that a SCSI drive's firmware had a bug
          which resulted in data corruption: it would report that a block was
          on disk when it wasn't.

          It was eventually discovered by a database platform that verified its
          writes by flushing the cache and rereading the data from the platter.
          ZFS would have caught this.
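
          In POSIX terms the verify step looked roughly like this (a sketch
          from memory, not the actual database code; note posix_fadvise() is
          only a hint, so this cannot by itself force a read from the
          platter - and the flush is exactly what that buggy firmware lied
          about honoring):

          #include <fcntl.h>
          #include <stdlib.h>
          #include <string.h>
          #include <unistd.h>

          static int write_and_verify(int fd, const void *buf, size_t len,
                                      off_t off)
          {
              if (pwrite(fd, buf, len, off) != (ssize_t)len)
                  return -1;
              if (fsync(fd) != 0)   /* flush -- trusting the drive here */
                  return -1;

              /* hint the OS to drop its cached copy, so the re-read has
               * to go back to the disk rather than to the page cache */
              (void)posix_fadvise(fd, off, (off_t)len, POSIX_FADV_DONTNEED);

              void *check = malloc(len);
              if (check == NULL)
                  return -1;
              int ok = pread(fd, check, len, off) == (ssize_t)len &&
                       memcmp(buf, check, len) == 0;
              free(check);
              return ok ? 0 : -1;   /* -1: read back != what we wrote */
          }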

          Cheers,
          Henk
        • maybird1776
            Message 4 of 24, Dec 1, 2006
            --- In solarisx86@yahoogroups.com, "Chad Leigh -- Shire.Net LLC"
            <chad@...> wrote:
            >
            > On Dec 1, 2006, at 2:43 AM, Henk Langeveld wrote:
            >
            > > ZFS and TCP have in common that they take end-to-end
            > > responsibility, and do not rely on a chain of subsystems each
            > > performing their own (valid) checks, but add an overall check
            > > between A and Z, protecting against subtly broken memory,
            > > ports, drivers, etc. You may know that you have top-quality
            > > equipment, but the file system does not.
            >
            > Yes, but the argument was not against ZFS; it was against the
            > claim that ZFS proves the RAID is screwing up, when it does not
            > prove this. The bug could be a SW bug in the kernel, or a
            > driver, or even in ZFS itself. I was complaining about the idea
            > that all problems are automatically in the RAID system when ZFS
            > has an issue, and not somewhere else. That is a false accusation
            > unless you can show proof.
            >
            > ZFS has the same problem in a JBOD system with a faulty driver
            > as it does with a RAID system with a faulty driver. And ZFS may
            > expose a problem that exists somewhere (which could be in ZFS or
            > in the kernel or a driver or in the HW or RAID itself) but the
            > problem is not pinpointed by ZFS.
            >
            > Chad
            >
            > ---
            > Chad Leigh -- Shire.Net LLC
            > Your Web App and Email hosting provider
            > chad at shire.net
            >

            There are some ZFS updates coming out for SXCR b53 (and Solaris 10u3)
            that you may want to check out.

            You have a valid point in that RAID is more affected by the quality
            of RAID device drivers **AND** hardware. I've seen a few cases in
            which the Southbridge chipset and RAID controller integration
            caused more problems than the device driver itself.

            As for ZFS, you may want to look into what is supported for RAID
            configurations by the various device drivers being used (e.g. the
            SATA (RAID) framework).

            Ken Mays
            EarthLink, Inc.
          • rogerfujii
              Message 5 of 24, Dec 3, 2006
              --- In solarisx86@yahoogroups.com, Al Hopper <al@...> wrote:

              > The SiliconImage 3124 is a *Sun* developed/supported driver and ships
              > standard in the current release of Solaris 06/06 (aka Update 2).

              Not sure if I'd have too much confidence in the U2 driver - the
              3132 I have doesn't function with it (though it seemed to be
              fine with SX when I had a quick chance to test it).

              > A user reported success with the following 4-port board based on the
              > SiliconImage 3124 chipset (details in the archives):
              > http://www.newegg.com/Product/Product.asp?Item=N82E16815124020
              >
              > Cost is $20.

              But this is not using the si3124 driver - the 3114 seems to use
              the pci-ide driver:
              http://forum.sun.com/jive/thread.jspa?threadID=94846&messageID=330708

              -r
            • Wes Williams
                Message 6 of 24, Dec 4, 2006
                --- In solarisx86@yahoogroups.com, "rogerfujii" <rmf@...> wrote:
                >
                > --- In solarisx86@yahoogroups.com, Al Hopper <al@> wrote:
                >
                > > The SiliconImage 3124 is a *Sun* developed/supported driver and ships
                > > standard in the current release of Solaris 06/06 (aka Update 2).
                >
                > Not sure if I'd have too much confidence in the U2 driver -
                > the 3132 I have doesn't function with it (though it seemed
                > to be fine with SX when I had a quick chance to test it).
                >
                > > A user reported success with the following 4-port board based on the
                > > SiliconImage 3124 chipset (details in the archives):
                > > http://www.newegg.com/Product/Product.asp?Item=N82E16815124020
                > >
                > > Cost is $20.
                >
                > But this is not using the si3124 driver - the 3114 seems to
                > use the pci-ide driver:
                > http://forum.sun.com/jive/thread.jspa?threadID=94846&messageID=330708
                >
                > -r
                >

                For what it's worth, I have a Syba SD-SATA2-2E2I (SI3124 chipset),
                with PCI-X 133MHz 64-bit support, due to arrive on Thursday.
                http://www.newegg.com/Product/Product.asp?Item=N82E16816124003
                http://www.syba.com/product/43/02/08/index.html

                My intent is to use this card for SATA 3.0Gb/s (NCQ!) instead
                of SATA 1.5Gb/s in my W1100z file/web server. The W1100z has a
                ZFS mirror of two new 400GB Seagate Barracuda ES series drives,
                though the current OS is Solaris Express 11/06.

                Let me know if there is anything here that can help.