dcfldd and bad sectors
- The following question originally arose over on another list:
"Open source tools such as DCFLdd v1.3.4-1 can usually recover all
data, with exception of the physically damaged sectors. (It is
important that DCFLdd v1.3.4-1 be installed on a FreeBSD operating
system. Studies have shown that the same program installed on a Linux
system produces extra 'bad sectors', resulting in the loss of
information that is actually available.)"
It has since been picked up by a second list that is not public, and
since I think that the question is of general interest, I will repost
my reply to the second list here so that it may benefit a broader
audience. There is a related discussion over on the sleuthkit-users
mailing list.
I can't speak specifically to DCFLdd; however, I do have some
familiarity with DD in general. :-) GNU DD reads data in blocks of a
specified size. If an error occurs while reading from a block and
the 'noerror' conversion was specified, DD skips the whole block and
a fill pattern (usually zeroes) is written to the output in place of
the missing data. If certain data cannot be read from a drive then
by definition you lose that data.
Disk drives are block devices. You read data from a block device in
blocks; either you get the whole block or you get nothing. For disk
devices the block size is equal to the sector size. Historically,
most disk devices have used a block size of 512 bytes. Some high
capacity drives now use a block size of 4096 bytes.
As long as you use a DD block size that is equal to the device sector
size (e.g. 512 bytes) all is good. You shouldn't have any lost
sectors. The problem is that reading from a drive 512 bytes at a
time is REALLY SLOW. A larger block size is preferred for
performance reasons. But DD always skips an entire block based on
DD's (not the device's) block size. If DD's block size is larger
than the device sector size then some usable data may be lost.
The default block size for GNU DD is 512 bytes. The obsolete FAU-DD
(still available with Helix) uses a default block size of 4096
bytes. The current supported FAU-DD (available from
http://www.gmgsystemsinc.com/fau/) uses a default block size of 1
MiB, and a block size of 5 MiB or more is recommended for static
(not "live") acquisitions.
Perhaps a related problem is that different operating systems read
from drives in different multiples of the device sector size.
Microsoft Windows reads from disk drives in cluster-sized units (= 8
x 512 = 4096 bytes). So there is the potential that some data may be lost
here, as well as at the application level. Different *nix systems
may use different algorithms. You really have to test the specific
*nix distribution that you are using.
In my experience MS Windows correctly handles "bad blocks" on disk
devices notwithstanding its use of cluster-sized read units. Of
course that could change with the next release of MS Windows. It
also might not be true with different storage architectures (e.g.
flash drives) or devices that use a non-MS device driver. For this
reason we need to constantly test and re-test.
The current released version of FAU-DD (available from
http://www.gmgsystemsinc.com/fau/) uses a slightly different
algorithm from GNU-DD in that it is able to use a relatively large
default block size (1-5 MiB or more) for performance, but will drop
down to the device sector size (usually 512 bytes) when it encounters
a "bad block." Then the larger block size is resumed once the "bad
block" has been passed. Now you no longer need to choose between
performance and reliability when using DD to image a drive. The
current released version of FAU-DD is available exclusively from GMG
Systems, Inc.
One final problem is that the data read from a failing drive actually
may change from one acquisition to another. If you encounter a "bad
block" that means that the error rate has overwhelmed the error
correction algorithm in use by the drive. A disk drive is not a
paper document. If a drive actually yields different data each time
it is read, is that an acquisition "error"? Or have you accurately
acquired the contents of the drive at that particular moment in
time? Perhaps you have as many originals as acquired "images."
Maybe it is a question of semantics, but it is a semantic that goes
to the heart of DIGITAL forensics.
Remember that hashes do not guarantee that an "image" is accurate.
They prove that it has not changed since it was acquired.
- BGrundy wrote:
> The bs=512 option has no effect on this.[...]
> The test drive had 4 bad sectors. All the Linux based dd
> tools missed between 200 and 232 sectors. Other tools
> missed just the 4. When the dd commands were run with
> the device associated with /dev/raw, they correctly
> reported 4 bad sectors. The only Linux util that did
> not suffer the problem was GNU ddrescue (not dd_rescue),
> as long as the -d flag was used ("direct access" - like
> using /dev/raw).
> I really think this has to do with kernel caching. Hence
> the correct output with /dev/raw. Just a theory. <
DDrescue with '-d' uses the O_DIRECT flag (unbuffered IO) that was
introduced with the Linux 2.4 kernels (google for "O_DIRECT"):
    const int ides = open( iname, O_RDONLY | O_DIRECT );
Most modern operating systems use "buffered" disk IO by default. In
essence, the operating system reads from the drive using a default
algorithm and then the application reads from the OS buffer rather
than directly from the hardware. The DD block size (bs=512) has no
effect on how data actually is read when using buffered IO. The
operating system uses its own algorithm, which typically involves
reading more than one sector at a time for performance reasons.
Contemporary operating systems also permit you to override the
default behavior by specifying a flag such as O_DIRECT or
FILE_FLAG_NO_BUFFERING on Windows. With "direct" or "unbuffered" IO
data is read directly into the application buffer exactly as it is
requested (bs=512 affects how data actually is read from the drive).
Then there are the design decisions made by drive manufacturers. For
the drives that I tested, if you request 4 sectors and 1 of the 4
sectors is bad then you will successfully read 0 sectors. If you
request each of the 4 sectors one at a time then you will get 3
sectors and fail to read 1 sector. But a different drive
manufacturer or architecture (e.g. a flash drive) could implement
this behavior differently.
So what you have is a complex interaction between hardware (disk
drive), OS and application design decisions. Buffered IO greatly
simplifies things for application developers. But then you have to
live with the default OS algorithm which is usually optimized for
performance. Direct IO improves performance and provides greater
control over how data is read from the drive; but there are special
rules for read access that are imposed by the limitations of the
hardware.
DD was written before the modern buffered vs. unbuffered IO
distinction. I rather suspect that it predates the advent of
buffered IO, if there is someone around who is able to remember back
that far. Adapting DD to use unbuffered (direct) IO was no simple
task, not least because DD also is supposed to be able to read
regular files, which may be encrypted, compressed or sparse. That is
one reason why we chose to rewrite the current
released version of FAU-DD starting from scratch.
Using 'conv=noerror' (or 'conv=noerror,sync' on *nix) is the correct
algorithm when properly implemented. But what is "proper" could
change with the next hot fix or service pack or generation of
drives. So we need to constantly test and retest. Thanks for taking
the time to test this. Your efforts will benefit the entire
community.