PROBLEM: system crashes with oops message

Ken Schneider (kschneider@rtsx.com)
03 Dec 2002 10:52:43 -0500


system info below:

oops message:

Unable to handle kernel NULL pointer dereference at virtual address
0000002d
printing eip:
c0150dfc
*pde = 00000000
Oops: 0000
CPU: 1
EIP: 0010:[<c0150dfc>] Not tainted
EFLAGS: 00010246
eax: dae03ae0 ebx: 00000000 ecx: 00000000 edx: c0720000
esi: 00000000 edi: 00000000 ebp: 00008001 esp: c0721f34
ds: 0018 es: 0018 ss: 0018
Process smbd (pid: 6105, stackpage=c0721000)
Stack: dae03ae0 ffffffff 0821c438 c0721f8c cf721cf8 c014a980 dae03ae0
00008001
00008000 c55cf000 0821c438 bfffed8c c160ddf8 00000000 00000004
d3dad760
c013e4e3 c55cf000 00008001 00000000 c0721f8c 0000001d d3dad760
c1c37ee0
Call Trace: [<c014a980>] [<c013e4e3>] [<c013e87e>] [<c0108b73>]

Code: f6 43 2d 10 74 59 f7 c5 00 08 00 00 0f 85 d7 00 00 00 8b 82

*********************************************************************

output of ksymoops:

ksymoops 2.4.3 on i686 2.4.18-4GB-SMP. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.18-4GB-SMP/ (default)
-m /boot/System.map-2.4.18-4GB-SMP (default)

Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.

Unable to handle kernel NULL pointer dereference at virtual address
0000002d
c0150dfc
*pde = 00000000
Oops: 0000
CPU: 1
EIP: 0010:[<c0150dfc>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: dae03ae0 ebx: 00000000 ecx: 00000000 edx: c0720000
esi: 00000000 edi: 00000000 ebp: 00008001 esp: c0721f34
ds: 0018 es: 0018 ss: 0018
Process smbd (pid: 6105, stackpage=c0721000)
Stack: dae03ae0 ffffffff 0821c438 c0721f8c cf721cf8 c014a980 dae03ae0
00008001
00008000 c55cf000 0821c438 bfffed8c c160ddf8 00000000 00000004
d3dad760
c013e4e3 c55cf000 00008001 00000000 c0721f8c 0000001d d3dad760
c1c37ee0
Call Trace: [<c014a980>] [<c013e4e3>] [<c013e87e>] [<c0108b73>]
Code: f6 43 2d 10 74 59 f7 c5 00 08 00 00 0f 85 d7 00 00 00 8b 82

>>EIP; c0150dfc <__get_lease+4c/24c> <=====
Trace; c014a980 <open_namei+3e4/69c>
Trace; c013e4e2 <filp_open+32/54>
Trace; c013e87e <sys_open+36/9c>
Trace; c0108b72 <system_call+32/40>
Code; c0150dfc <__get_lease+4c/24c>
00000000 <_EIP>:
Code; c0150dfc <__get_lease+4c/24c> <=====
0: f6 43 2d 10 testb $0x10,0x2d(%ebx) <=====
Code; c0150e00 <__get_lease+50/24c>
4: 74 59 je 5f <_EIP+0x5f> c0150e5a
<__get_lease+aa/24c>
Code; c0150e02 <__get_lease+52/24c>
6: f7 c5 00 08 00 00 test $0x800,%ebp
Code; c0150e08 <__get_lease+58/24c>
c: 0f 85 d7 00 00 00 jne e9 <_EIP+0xe9> c0150ee4
<__get_lease+134/24c>
Code; c0150e0e <__get_lease+5e/24c>
12: 8b 82 00 00 00 00 mov 0x0(%edx),%eax

1 warning issued. Results may not be reliable.

-----------------------------------------------------------------------------

Running SuSE 8.0 system with only SuSE supplied RPMS
Using system as a fileserver using samba w/winbind and XFS filesystem
and extended ACL's. Authentication against an NT 4.0 server for share
access.

Hardware:
Compaq Proliant 6000
512m ram
10k rpm disks in raid 5 array
4 Intel pentium pro 200mhz processors (zeon)

System crashes HARD intermittenly (1-20 days) and only way to recover is
by system power cycle.

Any help or redirection would be appreciated.

-- 
Ken Schneider
*NIX Administrator
Network Administrator
Radiation Therapy Services, Inc.
kschneider@rtsx.com

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/