
USB RAID - sdb keeps failing

  • biodiesel_bri
    Message 1 of 3 , Jan 18, 2011
      I can't seem to get my second HD to work on this raid array. I tested the HD using the WD utility many times and got no errors. Anyone have any idea why this keeps happening or a way to fix it?

      The drives are identical WD Scorpio Blue 500gb drives and identical USB enclosures. I have the same setup on /two/ other slugs and it works fine.

      /var/log/messages:
      Jan 19 05:07:44 (none) user.notice kernel: sd 1:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
      Jan 19 05:07:44 (none) user.notice kernel: sd 1:0:0:0: [sdb] Write Protect is off
      Jan 19 05:07:44 (none) user.debug kernel: sd 1:0:0:0: [sdb] Mode Sense: 23 00 00 00
      Jan 19 05:07:44 (none) user.err kernel: sd 1:0:0:0: [sdb] Assuming drive cache: write through
      Jan 19 05:07:44 (none) user.info kernel: sdb: sdb1
      Jan 19 05:07:45 (none) user.err kernel: FAT: utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive!
      Jan 19 05:07:45 (none) user.err kernel: FAT: utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive!
      Jan 19 05:07:45 (none) user.notice root: mount.sh/automount Not removing non-empty directory [/media/sdb1]
      Jan 19 05:08:05 (none) user.info kernel: md: bind<sdb1>
      Jan 19 05:08:05 (none) user.warn kernel: RAID1 conf printout:
      Jan 19 05:08:05 (none) user.warn kernel: --- wd:1 rd:2
      Jan 19 05:08:05 (none) user.warn kernel: disk 0, wo:0, o:1, dev:sda3
      Jan 19 05:08:05 (none) user.warn kernel: disk 1, wo:1, o:1, dev:sdb1
      Jan 19 05:08:05 (none) user.info kernel: md: recovery of RAID array md4
      Jan 19 05:08:05 (none) user.info kernel: md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
      Jan 19 05:08:05 (none) user.info kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
      Jan 19 05:08:05 (none) user.info kernel: md: using 128k window, over a total of 484367680 blocks.
      Jan 19 05:09:46 (none) user.info kernel: usb 1-2: reset high speed USB device using ehci_hcd and address 3
      Jan 19 05:09:56 (none) user.info kernel: usb 1-2: reset high speed USB device using ehci_hcd and address 3
      Jan 19 05:10:13 (none) user.info kernel: usb 1-2: reset high speed USB device using ehci_hcd and address 3
      Jan 19 05:10:13 (none) user.info kernel: usb 1-2: reset high speed USB device using ehci_hcd and address 3
      Jan 19 05:10:23 (none) user.info kernel: usb 1-2: reset high speed USB device using ehci_hcd and address 3
      Jan 19 05:10:24 (none) user.info kernel: sd 1:0:0:0: Device offlined - not ready after error recovery
      Jan 19 05:10:24 (none) user.info kernel: sd 1:0:0:0: [sdb] Result: hostbyte=0x05 driverbyte=0x00
      Jan 19 05:10:24 (none) user.err kernel: end_request: I/O error, dev sdb, sector 727487
      Jan 19 05:10:24 (none) user.alert kernel: raid1: Disk failure on sdb1, disabling device.
      Jan 19 05:10:24 (none) user.alert kernel: raid1: Operation continuing on 1 devices.
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.err kernel: sd 1:0:0:0: rejecting I/O to offline device
      Jan 19 05:10:24 (none) user.info kernel: sd 1:0:0:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
      Jan 19 05:10:24 (none) user.err kernel: end_request: I/O error, dev sdb, sector 727615
      Jan 19 05:10:24 (none) user.info kernel: md: md4: recovery done.
      Jan 19 05:10:24 (none) user.warn kernel: RAID1 conf printout:
      Jan 19 05:10:24 (none) user.warn kernel: --- wd:1 rd:2
      Jan 19 05:10:24 (none) user.warn kernel: disk 0, wo:0, o:1, dev:sda3
      Jan 19 05:10:24 (none) user.warn kernel: disk 1, wo:1, o:0, dev:sdb1
      Jan 19 05:10:24 (none) user.warn kernel: RAID1 conf printout:
      Jan 19 05:10:24 (none) user.warn kernel: --- wd:1 rd:2
      Jan 19 05:10:24 (none) user.warn kernel: disk 0, wo:0, o:1, dev:sda3
    • Mike Westerhof (mwester)
      Message 2 of 3 , Jan 19, 2011
        On 1/18/2011 11:39 PM, biodiesel_bri wrote:
        > I can't seem to get my second HD to work on this raid array. I tested the HD using the WD utility many times and got no errors. Anyone have any idea why this keeps happening or a way to fix it?
        >
        > The drives are identical WD Scorpio Blue 500gb drives and identical USB enclosures. I have the same setup on /two/ other slugs and it works fine.

        Looks like a failed hard drive.

        The vendor-provided utilities are often inadequate; in order to test it
        properly, you'll have to take the HDD itself out of the enclosure, put
        it on an IDE or SATA cable, and run a full disk diagnostic suite on it.

        In the meantime, what happens if you repartition the drive (with fdisk),
        put a filesystem on it with mke2fs, and then run fsck with the option to
        do full bad-block checking? You should be able to do that in the USB
        enclosure, although it's doubtful that the NSLU2 has the memory to
        complete the fsck operation without extra swap space -- best do this on a
        real host.
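
The filesystem-level test described above could be laid out roughly as follows. This is a hedged sketch that only prints the commands and runs nothing. The device name `/dev/sdX1` is a placeholder, `build_test_commands` is a hypothetical helper, and choosing ext3 (`mke2fs -j`) is an assumption. The `-c` option makes e2fsck run a read-only badblocks scan. Repartitioning with fdisk is interactive and is left out. Both commands are destructive or slow, so double-check the device name before running them by hand.

```python
import shlex


def build_test_commands(partition):
    """Return the test steps as argv lists; nothing is executed here."""
    if not partition.startswith("/dev/"):
        raise ValueError("expected a device node such as /dev/sdX1")
    return [
        # Create an ext3 filesystem (this erases whatever is on the partition).
        ["mke2fs", "-j", partition],
        # Force a full check and have e2fsck run a read-only badblocks scan.
        ["e2fsck", "-f", "-c", partition],
    ]


# Dry run: print the commands for review instead of executing them.
for argv in build_test_commands("/dev/sdX1"):
    print(shlex.join(argv))
```

Printing rather than executing is deliberate: it lets you confirm the target partition on the host where the drive is actually attached.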

        -Mike (mwester)
      • Bill
        Message 3 of 3 , Jan 19, 2011
          My experience has been that software RAID does not work reliably on NSLU2s with hard drives. The reason is that software RAID requires a significant amount of non-swappable memory, and you don't have much memory on an NSLU2 to begin with. Swap does wonders, but when you only have a small amount of physical memory, sometimes there is not even enough available to use swap, in which case the kernel will eventually kill processes at random or halt (depending on your settings).
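
One way to see whether memory is the limiting factor is to watch free RAM and free swap while the array rebuilds. The sketch below is illustrative only: it parses `/proc/meminfo`-style text, and the 2048 kB threshold and the function names are arbitrary assumptions, not values from this thread. It needs Python 3, which may not be practical on the slug itself; the same fields can be read by hand with `cat /proc/meminfo`.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def memory_pressure(info, min_free_kb=2048):
    """Flag low memory; min_free_kb is an arbitrary illustrative threshold."""
    free = info.get("MemFree", 0)
    swap_free = info.get("SwapFree", 0)
    return {"free_kb": free, "swap_free_kb": swap_free,
            "low": free < min_free_kb}


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            print(memory_pressure(parse_meminfo(fh.read())))
    except OSError:
        print("/proc/meminfo is not available on this system")
```

Running it every few seconds during a resync and noting when `low` turns true would show whether the failures line up with memory exhaustion.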

          Even worse, try using LVM on a couple of 1.5TB drives.... You'll quickly conclude that instead of LVM you are best off just using raw partitions, and adjusting your file allocation strategies accordingly.

          Bill

          --- In nslu2-general@yahoogroups.com, "Mike Westerhof (mwester)" <mwester@...> wrote:
          >
          > On 1/18/2011 11:39 PM, biodiesel_bri wrote:
          > > I can't seem to get my second HD to work on this raid array. I tested the HD using the WD utility many times and got no errors. Anyone have any idea why this keeps happening or a way to fix it?
          > >
          > > The drives are identical WD Scorpio Blue 500gb drives and identical USB enclosures. I have the same setup on /two/ other slugs and it works fine.
          >
          > Looks like a failed hard drive.
          >
          > The vendor-provided utilities are often inadequate; in order to test it
          > properly, you'll have to take the HDD itself out of the enclosure, put
          > it on an IDE or SATA cable, and run a full disk diagnostic suite on it.
          >
          > In the meantime, what happens if you repartition the drive (with fdisk),
          > put a filesystem on it with mke2fs, and then run fsck with the option to
          > do full bad-block checking? You should be able to do that in the USB
          > enclosure, although it's doubtful that the NSLU2 has the memory to
          > complete the fsck operation without extra swap space -- best do this on a
          > real host.
          >
          > -Mike (mwester)
          >