dcfldd and bad sectors

  • rossetoecioccolato
    Message 1 of 2, May 23, 2008
      The following question originally arose over on another list:

      "Open source tools such as DCFLdd v1.3.4-1 can usually recover all
      data, with exception of the physically damaged sectors. (It is
      important that DCFLdd v1.3.4-1 be installed on a FreeBSD operating
      system. Studies have shown that the same program installed on a Linux
      system produces extra 'bad sectors', resulting in the loss of
      information that is actually available.)"

      It has since been picked up by a second list that is not public, and
      since I think that the question is of general interest, I will repost
      my reply to the second list here so that it may benefit a broader
      audience:

      There is a related discussion over on the sleuthkit-users mailing
      list:
      http://www.nabble.com/Best-practices-in-dealing-with-bad-blocks-and-hashes.-td16594673.html.

      I can't speak specifically to DCFLdd; however, I do have some
      familiarity with DD in general. :-) GNU DD reads data in blocks of a
      specified size. If an error occurs while reading a block and the
      'noerror' conversion was specified, DD skips the whole block and, when
      'sync' is also specified, writes a fill pattern of zeroes to the output
      in place of the missing data. If certain data cannot be read from a
      drive, then by definition you lose that data.
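
      To make the mechanics concrete, here is a minimal C sketch of that
      behavior (my illustration only, not dcfldd's or GNU DD's actual source;
      the device name, output file and block size are assumptions): read the
      input in fixed-size blocks and, on a read error, write a zero-filled
      block of the same size to the output before moving on.

      #include <fcntl.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <unistd.h>

      #define BLOCK_SIZE 4096        /* DD's "bs=" value, not the sector size */

      int main(void)
      {
          int in  = open("/dev/sdb", O_RDONLY);         /* hypothetical source */
          int out = open("image.dd", O_WRONLY | O_CREAT | O_TRUNC, 0644);
          if (in < 0 || out < 0) { perror("open"); return 1; }

          unsigned char *buf = malloc(BLOCK_SIZE);
          off_t offset = 0;

          for (;;) {
              ssize_t n = pread(in, buf, BLOCK_SIZE, offset);
              if (n == 0)
                  break;                                /* end of input */
              if (n < 0) {
                  /* 'noerror,sync' style handling: the whole block is replaced
                     by a fill pattern of zeroes, even if only one sector
                     within it was actually unreadable. */
                  memset(buf, 0, BLOCK_SIZE);
                  n = BLOCK_SIZE;
              }
              if (write(out, buf, n) != n) { perror("write"); return 1; }
              offset += n;
          }

          free(buf);
          close(in);
          close(out);
          return 0;
      }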

      Disk drives are block devices. You read data from a block device in
      blocks; either you get the whole block or you get nothing. For disk
      devices the block size is equal to the sector size. Historically,
      most disk devices have used a block size of 512 bytes. Some high
      capacity drives now use a block size of 4096 bytes.

      As long as you use a DD block size that is equal to the device sector
      size (e.g. 512 bytes), all is good. You shouldn't have any lost
      sectors. The problem is that reading from a drive 512 bytes at a
      time is REALLY SLOW. A larger block size is preferred for
      performance reasons. But DD always skips an entire block based on
      DD's (not the device's) block size. If DD's block size is larger
      than the device sector size, then some usable data may be lost. For
      example, with bs=4096 and 512-byte sectors, a single unreadable sector
      causes all eight sectors in that block to be zeroed, even though seven
      of them may have been perfectly readable.

      The default block size for GNU DD is 512 bytes. The obsolete FAU-DD
      (still available with Helix) uses a default block size of 4096
      bytes. The current supported FAU-DD (available from
      http://www.gmgsystemsinc.com/fau/) uses a default block size of 1
      MiB, and a block size of 5 MiB or more is recommended for static
      (not "live") acquisitions.

      Perhaps a related problem is that different operating systems read
      from drives in different multiples of the device sector size.
      Microsoft Windows reads from disk drives in cluster-sized units
      (typically 8 x 512 = 4096 bytes). So there is the potential that some
      data may be lost
      here, as well as at the application level. Different *nix systems
      may use different algorithms. You really have to test the specific
      *nix distribution that you are using.

      In my experience MS Windows correctly handles "bad blocks" on disk
      devices notwithstanding its use of cluster-sized read units. Of
      course that could change with the next release of MS Windows. It
      also might not be true with different storage architectures (e.g.
      flash drives) or devices that use a non-MS device driver. For this
      reason we need to constantly test and re-test.

      The current released version of FAU-DD (available from
      http://www.gmgsystemsinc.com/fau/) uses a slightly different
      algorithm from GNU-DD in that it is able to use a relatively large
      default block size (1-5 MiB or more) for performance, but will drop
      down to the device sector size (usually 512 bytes) when it encounters
      a "bad block." Then the larger block size is resumed once the "bad
      block" has been passed. Now you no longer need to choose between
      performance and reliability when using DD to image a drive. The
      current released version of FAU-DD is available exclusively from GMG
      Systems, Inc.
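
      The fallback strategy is easy to sketch in C. The following is only my
      illustration of the approach described above, not FAU-DD's actual code,
      and the device name and sizes are assumptions: read in large blocks for
      speed and, when a large read fails, re-read that region one sector at a
      time so that only the genuinely unreadable sectors are zero-filled.

      #include <fcntl.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <unistd.h>

      #define SECTOR_SIZE 512
      #define BIG_BLOCK   (1024 * 1024)         /* 1 MiB "fast path" block */

      /* Copy one big block; on a read error re-read it sector by sector so
         that only the genuinely unreadable sectors are zero-filled.
         (End-of-device handling is simplified for brevity.) */
      static ssize_t copy_block(int in, int out, off_t offset, unsigned char *buf)
      {
          ssize_t n = pread(in, buf, BIG_BLOCK, offset);
          if (n >= 0)
              return (n == 0 || write(out, buf, n) == n) ? n : -1;

          for (size_t i = 0; i < BIG_BLOCK; i += SECTOR_SIZE) {
              ssize_t s = pread(in, buf + i, SECTOR_SIZE, offset + (off_t)i);
              if (s <= 0)
                  memset(buf + i, 0, SECTOR_SIZE);  /* only this sector is lost */
          }
          return write(out, buf, BIG_BLOCK) == BIG_BLOCK ? BIG_BLOCK : -1;
      }

      int main(void)
      {
          int in  = open("/dev/sdb", O_RDONLY);         /* hypothetical source */
          int out = open("image.dd", O_WRONLY | O_CREAT | O_TRUNC, 0644);
          if (in < 0 || out < 0) { perror("open"); return 1; }

          unsigned char *buf = malloc(BIG_BLOCK);
          off_t offset = 0;
          ssize_t n;

          while ((n = copy_block(in, out, offset, buf)) > 0)
              offset += n;

          free(buf);
          close(in);
          close(out);
          return n < 0 ? 1 : 0;
      }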

      One final problem is that the data read from a failing drive actually
      may change from one acquisition to another. If you encounter a "bad
      block" that means that the error rate has overwhelmed the error
      correction algorithm in use by the drive. A disk drive is not a
      paper document. If a drive actually yields different data each time
      it is read, is that an acquisition "error"? Or have you accurately
      acquired the contents of the drive at that particular moment in
      time? Perhaps you have as many originals as acquired "images."
      Maybe it is a question of semantics, but it is a semantic distinction
      that goes to the heart of DIGITAL forensics.

      Remember that hashes do not guarantee that an "image" is accurate.
      They prove that it has not changed since it was acquired.

      Regards,

      ReC.
    • rossetoecioccolato
      Message 2 of 2, May 23, 2008
        BGrundy wrote:

        > The bs=512 option has no effect on this.
        [...]
        > The test drive had 4 bad sectors. All the Linux based dd
        > tools missed between 200 and 232 sectors. Other tools
        > missed just the 4. When the dd commands were run with
        > the device associated with /dev/raw, they correctly
        > reported 4 bad sectors. The only Linux util that did
        > not suffer the problem was GNU ddrescue (not dd_rescue),
        > as long as the -d flag was used ("direct access" - like
        > using /dev/raw).
        >
        > I really think this has to do with kernel caching. Hence
        > the correct output with /dev/raw. Just a theory. <

        DDrescue with '-d' uses the O_DIRECT flag (unbuffered IO) that was
        introduced with Linux 2.4 kernels (google for "O_DIRECT"):

        int do_rescue()
        {
            const int ides = open( iname, O_RDONLY | O_DIRECT );
            [..]
        }

        Most modern operating systems use "buffered" disk IO by default. In
        essence, the operating system reads from the drive using a default
        algorithm and then the application reads from the OS buffer rather
        than directly from the hardware. The DD block size (bs=512) has no
        effect on how data actually is read when using buffered IO. The
        operating system uses its own algorithm, which typically involves
        reading more than one sector at a time for performance reasons.

        Contemporary operating systems also permit you to override the
        default behavior by specifying a flag such as O_DIRECT (on *nix) or
        FILE_FLAG_NO_BUFFERING (on Windows). With "direct" or "unbuffered"
        IO, data is read directly into the application buffer exactly as it
        is requested, so bs=512 does affect how data actually is read from
        the drive.
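
        For anyone who wants to experiment, here is a minimal C sketch of the
        direct IO approach on *nix (my illustration only, not ddrescue or
        FAU-DD code; the device name and sector size are assumptions). Note
        that O_DIRECT typically requires the buffer, offset and length to be
        aligned to the device's logical block size:

        #define _GNU_SOURCE                 /* exposes O_DIRECT on Linux */
        #include <fcntl.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <unistd.h>

        #define SECTOR_SIZE 512

        int main(void)
        {
            /* Bypass the kernel's buffer cache entirely. */
            int fd = open("/dev/sdb", O_RDONLY | O_DIRECT); /* hypothetical drive */
            if (fd < 0) { perror("open"); return 1; }

            /* O_DIRECT typically requires the buffer, the file offset and the
               transfer length to be aligned to the device's logical block size. */
            void *buf;
            if (posix_memalign(&buf, SECTOR_SIZE, SECTOR_SIZE) != 0) {
                fprintf(stderr, "allocation failed\n");
                return 1;
            }

            /* With direct IO this one-sector request really is issued to the
               device at this size, so a failure identifies an individual bad
               sector rather than a larger cached region. */
            ssize_t n = pread(fd, buf, SECTOR_SIZE, 0);
            if (n < 0)
                perror("pread");
            else
                printf("read %zd bytes from sector 0\n", n);

            free(buf);
            close(fd);
            return 0;
        }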

        Then there are the design decisions made by drive manufacturers. For
        the drives that I tested, if you request 4 sectors and 1 of the 4
        sectors is bad then you will successfully read 0 sectors. If you
        request each of the 4 sectors one at a time then you will get 3
        sectors and fail to read 1 sector. But a different drive
        manufacturer or architecture (e.g. a flash drive) could implement
        things differently.

        So what you have is a complex interaction between hardware (disk
        drive), OS and application design decisions. Buffered IO greatly
        simplifies things for application developers. But then you have to
        live with the default OS algorithm which is usually optimized for
        performance. Direct IO improves performance and provides greater
        control over how data is read from the drive; but there are special
        rules for read access that are imposed by the limitations of the
        underlying hardware.

        DD was written before the modern buffered vs. unbuffered IO
        distinction. I rather suspect that it predates the advent of
        buffered IO, if there is someone around who is able to remember back
        that far. Adapting DD to use unbuffered (direct) IO was no simple
        task, not least because DD is also supposed to be able to read
        regular files, which may be encrypted, compressed or sparse. That is
        one reason why we chose to rewrite the current released version of
        FAU-DD from scratch.

        Using 'conv=noerror' (or 'conv=noerror,sync' on *nix) is the correct
        algorithm when properly implemented. But what is "proper" could
        change with the next hot fix or service pack or generation of
        drives. So we need to constantly test and retest. Thanks for taking
        the time to test this. Your efforts will benefit the entire
        community.

        Regards,

        ReC.