Re: Hardware problem?
- Dear all, dear ekpulse,
if you do have a hardware problem, then it looks like I have it too
because my machine behaves exactly as you describe.
However, after the last thread and my post I did some more testing as
indicated by someone else on this list. I hooked up the NSLU2
directly to my laptop with a very short xover cable and... sure
enough, the problems remained. Same things happened (oopsing and
crashing while copying very large files).
Then I did a different test to understand whether it's a problem with
the network. I attempted to do several copies of a 1.2 Gbyte file from
the NSLU2 disk right into the same disk, no network involved and...
yes, oopsed and crashed again. I rebooted, fsck'ed the disk, tried to
remove whatever was left from the crashed copies and... crashed again.
Finally, by performing one remove at a time, I succeeded in removing the copies.
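For anyone who wants to repeat the local (no network) test in a controlled way, here is a minimal sketch. It assumes Python is available on the box or on a machine with the disk attached; the function names and the `copytest_N.bin` file names are made up for illustration. It copies one file several times onto the same disk, checks each copy against a checksum, and then removes the copies one at a time.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in 1 MiB chunks so large files do not fill RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_stress(src, dest_dir, copies=3):
    """Copy src into dest_dir `copies` times, verify each copy,
    then remove the copies one at a time. Returns the number of copies made."""
    expected = sha256_of(src)
    made = []
    for i in range(copies):
        dst = os.path.join(dest_dir, "copytest_%d.bin" % i)  # hypothetical name
        shutil.copyfile(src, dst)
        made.append(dst)
        if sha256_of(dst) != expected:
            raise RuntimeError("checksum mismatch on " + dst)
    for dst in made:
        os.remove(dst)
    return len(made)
```

Calling `copy_stress("/path/to/bigfile", "/path/on/same/disk")` with a file of about 1.2 Gbyte should roughly match what I did by hand. A checksum mismatch without an oops would point more towards bad RAM or a bad disk than towards a kernel bug, but that is only a guess.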
The oopses from copying/removing both start with:
Unable to handle kernel paging request at virtual address 00200000
pgd = c1830000
Internal error: Oops: f5 [#1]
Modules linked in: tcp_diag ixp425_eth ixp400 ext3 jbd ipv6 ext2 mbcache
and then, I noticed a (perhaps) strange thing. The 'free' command gave
me, right after each crash, what follows:
              total         used         free       shared      buffers
  Mem:        30680        29548         1132            0          284
 Swap:        56216            0        56216
Total:        86896        29548        57348
That is: RAM almost to capacity but swap untouched. Could it possibly
be that the kernel crashes at the moment of swapping because of some
problem with the swapfile or something? I have properly formatted the
swap space with -v1, and the disk does not have bad blocks. But at other
times (like when running fsck) I can see the swap being used, so this
could be completely wrong.
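To compare these numbers across crashes without eyeballing them, a small sketch like the following could turn the 'free' output into percentages. The `parse_free` and `summarize` names are my own invention, and the sample text is just the output quoted above.

```python
SAMPLE = """\
total used free shared buffers
Mem: 30680 29548 1132 0 284
Swap: 56216 0 56216
Total: 86896 29548 57348
"""


def parse_free(text):
    """Return {row_name: {column: value}} from `free`-style output."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    header = rows[0]
    table = {}
    for fields in rows[1:]:
        name = fields[0].rstrip(":")
        values = [int(v) for v in fields[1:]]
        # Rows such as Swap have fewer columns; zip stops at the shorter list.
        table[name] = dict(zip(header, values))
    return table


def summarize(table):
    """Percentage of RAM and swap in use."""
    mem, swap = table["Mem"], table["Swap"]
    swap_pct = 100.0 * swap["used"] / swap["total"] if swap["total"] else 0.0
    return {
        "mem_used_pct": 100.0 * mem["used"] / mem["total"],
        "swap_used_pct": swap_pct,
    }
```

For the output above, `summarize(parse_free(SAMPLE))` gives a little over 96% RAM used and 0% swap used. As far as I know, 'used' on Linux normally includes the page cache, so a nearly full RAM figure after copying big files may be perfectly normal and not a sign of trouble in itself.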
ekpulse, do you observe oops traces in your messages file
(/var/log/messages, look for 'Oops')? It would be really good if we
could pin this down. Any other hint?
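If grepping by hand is tedious, a rough sketch like this one would pull out each oops together with the few lines that follow it. The function names and the choice of three context lines are assumptions of mine; the two search patterns are taken from the traces quoted above.

```python
import re

# Patterns taken from the oops traces quoted in this post.
OOPS_RE = re.compile(r"Oops|Unable to handle kernel paging request")


def find_oops(lines, context=3):
    """Return a list of (line_number, [matching line + following context])."""
    lines = list(lines)
    hits = []
    for i, line in enumerate(lines):
        if OOPS_RE.search(line):
            block = [l.rstrip("\n") for l in lines[i:i + 1 + context]]
            hits.append((i + 1, block))
    return hits


def scan_file(path="/var/log/messages", context=3):
    """Scan a log file; undecodable bytes are replaced, not fatal."""
    with open(path, "r", errors="replace") as f:
        return find_oops(f, context)
```

Running `scan_file()` as root on the slug (or on a copy of the messages file) should list every oops with its line number, which makes it easier to compare traces between our two machines.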
P.S. I have also installed and modprobe'd the tcp_diag module, but I
don't have a clue on how to use it.
P.P.S. My machine goes through exactly the same boot sequence as the one
you posted. Doesn't seem to be a problem (but I might be wrong).
--- In email@example.com, "epkulse" <uffe.z@t...> wrote:
> I'm now trying to run Openslug. Have previously tried FW r24 and r63.
> Also tried Unslung BETA 5.5. All the time, though, I have encountered
> instability of all kinds. In the beginning mainly when handling disk
> accesses. But now, even when setting up Openslug it is unstable. Lots
> of different interesting situations. Anyway - I'm more and more
> convinced that something is incorrect in the HW. I have enclosed some
> errors that I encounter in the dmesg - perhaps someone can verify if
> these errors could indicate a faulty HW?
> Linux version 220.127.116.11 (openslug@o...) (gcc version 3.4.4) #1
> Mon Aug 8 08:52:36 PDT 2005
> CPU: XScale-IXP42x Family [690541f1] revision 1 (ARMv5TE)
> CPU0: D VIVT undefined 5 cache
> CPU0: I cache: 32768 bytes, associativity 32, 32 byte lines, 32 sets
> CPU0: D cache: 32768 bytes, associativity 32, 32 byte lines, 32 sets
> Machine: Linksys NSLU2
> Warning: bad configuration page, trying to continue
> Memory policy: ECC disabled, Data cache writeback
> On node 0 totalpages: 8192
> DMA zone: 8192 pages, LIFO batch:3
> Normal zone: 0 pages, LIFO batch:1
> HighMem zone: 0 pages, LIFO batch:1
> Built 1 zonelists
> Kernel command line: root=/dev/mtdblock4 rw rootfstype=jffs2
> mem=32M@0x00000000 init=/linuxrc reboot=s noirqdebug
> ixp400: module license 'unspecified' taints kernel.
> Module init.
> Initializing IXP425 NPE Ethernet driver software v. 1.1A
> ixp425_eth: CPU clock speed (approx) = 0 MHz
> [error] ixEthMiiPhyScan : unexpected Mii PHY ID 00008201
> ixp425_eth: npe0 is using the PHY at address 0
> ixp425_eth: npe1 is using the PHY at address 1
> NET: Registered protocol family 17
> ixp425_eth: ixEthMiiLinkStatus failed on PHY0.
> Can't determine the auto negotiated parameters. Using default values.
- Not sure if this will help anyone, but I was having a very similar problem
where backing up large amounts of data to the NSLU2 would result in the
transfer coming to a halt after a while. The NSLU2 became totally
unresponsive from that point on. At first I was placing the blame on my
Maxtor OneTouch II for going into power save mode, but after further
investigation I discovered the real problem: my NSLU2 was overheating. I
had it lying flat on a carpeted area, which blocked ventilation and
caused it to heat up. As soon as I started using the stand instead of
laying it flat, the problem vanished and it has been working great ever
since. It is on 24/7 and I perform fairly large nightly backups with no
problems. The OneTouch power saving works great and has no undesirable side
effects at all so far. Perhaps this will help someone solve the problem.