OOPS in 2.2.19 : Unable to handle kernel NULL pointer dereference
[2.] Full description of the problem/report:
Kernel 2.2.19 produces oopses like the two below at random on my machine.
[3.] Keywords (i.e., modules, networking, kernel):
kernel, filesystem, raid, smp
[4.] Kernel version (from /proc/version):
Linux version 2.2.19 (root@deathstar.gkg-com.com) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #1 SMP Fri Apr 27 10:49:53 CDT 2001
[5.] Output of Oops.. message (if applicable) with symbolic information
     resolved (see Documentation/oops-tracing.txt)
Options used: -V (default)
              -o /lib/modules/2.2.19 (specified)
              -k /proc/ksyms (specified)
              -l /proc/modules (specified)
              -m /boot/System.map-2.2.19 (specified)
              -c 1 (default)
Oct 22 08:52:09 deathstar kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000100
Oct 22 08:52:09 deathstar kernel: current->tss.cr3 = 0bd8d000, %%cr3 = 0bd8d000
Oct 22 08:52:09 deathstar kernel: *pde = 00000000
Oct 22 08:52:09 deathstar kernel: Oops: 0000
Oct 22 08:52:09 deathstar kernel: CPU:    0
Oct 22 08:52:09 deathstar kernel: EIP:    0010:[find_buffer+104/144]
Oct 22 08:52:09 deathstar kernel: EFLAGS: 00010206
Oct 22 08:52:09 deathstar kernel: eax: 00000100   ebx: 00000007   ecx: 0007ce24   edx: 00000100
Oct 22 08:52:09 deathstar kernel: esi: 0000000d   edi: 00003006   ebp: 0004852c   esp: e17cbde0
Oct 22 08:52:09 deathstar kernel: ds: 0018   es: 0018   ss: 0018
Oct 22 08:52:09 deathstar kernel: Process postmaster (pid: 24405, process nr: 57, stackpage=e17cb000)
Oct 22 08:52:09 deathstar kernel: Stack: 0004852c 00003006 0007ce24 c012bd04 00003006 0004852c 00001000 0004852c
Oct 22 08:52:09 deathstar kernel:        c012c0a6 00003006 0004852c 00001000 0004852c 0004852c e2cd2498 e2cd2498
Oct 22 08:52:09 deathstar kernel:        00001000 c0143ecd 00003006 0004852c 00001000 00000000 0004852c e2cd2498
Oct 22 08:52:09 deathstar kernel: Call Trace: [get_hash_table+24/76] [getblk+30/324] [ext2_alloc_block+109/344] [block_getblk+305/616] [ext2_getblk+139/164] [__brelse+19/52] [ext2_file_write+1296/1572] [__brelse+19/52] [ext2_create+353/368] [permission+26/44] [open_namei+486/848] [filp_open+68/240] [filp_open+172/240] [sys_write+254/320] [ext2_file_write+0/1572] [system_call+52/56]
Oct 22 08:52:09 deathstar kernel: Code: 8b 00 39 6a 04 75 15 8b 4c 24 20 39 4a 08 75 0c 66 39 7a 0c
Code:  00000000 Before first symbol            00000000 <_IP>: <===
Code:  00000000 Before first symbol               0:	8b 00                	mov    (%eax),%eax <===
Code:  00000002 Before first symbol               2:	39 6a 04             	cmp    %ebp,0x4(%edx)
Code:  00000005 Before first symbol               5:	75 15                	jne     0000001c Before first symbol
Code:  00000007 Before first symbol               7:	8b 4c 24 20          	mov    0x20(%esp,1),%ecx
Code:  0000000b Before first symbol               b:	39 4a 08             	cmp    %ecx,0x8(%edx)
Code:  0000000e Before first symbol               e:	75 0c                	jne     0000001c Before first symbol
Code:  00000010 Before first symbol              10:	66 39 7a 0c          	cmp    %di,0xc(%edx)
Oct 22 09:13:54 deathstar kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000134
Oct 22 09:13:54 deathstar kernel: current->tss.cr3 = 12e61000, %%cr3 = 12e61000
Oct 22 09:13:54 deathstar kernel: *pde = 00000000
Oct 22 09:13:54 deathstar kernel: Oops: 0002
Oct 22 09:13:54 deathstar kernel: CPU:    0
Oct 22 09:13:54 deathstar kernel: EIP:    0010:[remove_from_queues+188/344]
Oct 22 09:13:54 deathstar kernel: EFLAGS: 00010206
Oct 22 09:13:54 deathstar kernel: eax: 00000100   ebx: de2a2e40   ecx: de2a2e40   edx: efdf3890
Oct 22 09:13:54 deathstar kernel: esi: 0000000c   edi: 00000000   ebp: 00000326   esp: d2fadeb8
Oct 22 09:13:54 deathstar kernel: ds: 0018   es: 0018   ss: 0018
Oct 22 09:13:54 deathstar kernel: Process postmaster (pid: 24996, process nr: 41, stackpage=d2fad000)
Oct 22 09:13:54 deathstar kernel: Stack: 0004952c c012baf2 de2a2e40 de2a2e40 c18fcec0 c012c372 de2a2e40 c01485cd
Oct 22 09:13:54 deathstar kernel:        de2a2e40 00000000 c6416000 00000000 c64160d4 00001000 c15ffc98 00000008
Oct 22 09:13:54 deathstar kernel:        00000400 00049206 00000000 00000326 c0148983 c6416000 0000000c c64160d0
Oct 22 09:13:54 deathstar kernel: Call Trace: [put_last_free+50/124] [__bforget+34/40] [trunc_indirect+493/668] [ext2_truncate+115/508] [ext2_delete_inode+102/140] [ext2_delete_inode+124/140] [iput+155/588] [d_delete+74/104] [ext2_unlink+371/404] [vfs_unlink+225/232] [sys_unlink+142/216] [system_call+52/56]
Oct 22 09:13:54 deathstar kernel: Code: 89 50 34 c7 01 00 00 00 00 89 02 c7 41 34 00 00 00 00 ff 0d
Code:  00000000 Before first symbol            00000000 <_IP>: <===
Code:  00000000 Before first symbol               0:	89 50 34             	mov    %edx,0x34(%eax) <===
Code:  00000003 Before first symbol               3:	c7 01 00 00 00 00    	movl   $0x0,(%ecx)
Code:  00000009 Before first symbol               9:	89 02                	mov    %eax,(%edx)
Code:  0000000b Before first symbol               b:	c7 41 34 00 00 00 00 	movl   $0x0,0x34(%ecx)
Code:  00000012 Before first symbol              12:	ff 0d 00 00 00 00    	decl   0x0
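A note on reading the two traces above.  The "Oops:" value is the i386
page-fault error code, and in both traces the faulting address is eax
(00000100) plus the displacement in the faulting instruction.  The following
is a minimal Python sketch of that reading; the function name is hypothetical
and is not part of ksymoops or the kernel.

```python
def decode_oops_code(code):
    """Decode the low three bits of an i386 page-fault error code."""
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }

# The two "Oops:" values reported above.
first = decode_oops_code(0x0000)
second = decode_oops_code(0x0002)

# Faulting address = base register + displacement of the decoded instruction.
first_addr = 0x100 + 0x00    # mov (%eax),%eax      with eax = 00000100
second_addr = 0x100 + 0x34   # mov %edx,0x34(%eax)  with eax = 00000100

print(first, hex(first_addr))
print(second, hex(second_addr))
```

Both faults go through the same bogus pointer value 00000100 in eax, once for
a read in find_buffer and once for a write in remove_from_queues.  That may
point at a corrupted link in the buffer lists rather than a true NULL pointer,
but that is only an interpretation; the traces alone do not prove it.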
[6.] A small shell script or example program which triggers the
     problem (if possible)
None available.  This happens randomly.  Sometimes within a few days, sometimes
it takes weeks.
[7.] Environment
[7.1.] Software (add the output of the ver_linux script here)
Linux deathstar.gkg-com.com 2.2.19 #1 SMP Fri Apr 27 10:49:53 CDT 2001 i686 unknown
Gnu C                  egcs-2.91.66
Gnu make               3.78.1
binutils               2.9.5.0.22
util-linux             2.10r
modutils               2.3.21
e2fsprogs              1.18
pcmcia-cs              3.1.8
Linux C Library        2.1.3
Dynamic linker (ldd)   2.1.3
Procps                 2.0.6
Net-tools              1.54
Console-tools          0.3.3
Sh-utils               2.0
Modules Loaded         3c59x DAC960
[7.2.] Processor information (from /proc/cpuinfo):
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 7
model name      : Pentium III (Katmai)
stepping        : 3
cpu MHz         : 501.143
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
sep_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr xmm
bogomips        : 999.42
processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 7
model name      : Pentium III (Katmai)
stepping        : 3
cpu MHz         : 501.143
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
sep_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr xmm
bogomips        : 999.42
[7.3.] Module information (from /proc/modules):
3c59x                  22480   2 (autoclean)
DAC960                 60848   3
[7.4.] SCSI information (from /proc/scsi/scsi)
# cat /proc/rd/c0/current_status
***** DAC960 RAID Driver Version 2.2.10 of 1 February 2001 *****
Copyright 1998-2001 by Leonard N. Zubkoff <lnz@dandelion.com>
Configuring Mylex DAC960PTL1 PCI RAID Controller
  Firmware Version: 4.08-0-37, Channels: 1, Memory Size: 8MB
  PCI Bus: 0, Device: 18, Function: 1, I/O Address: Unassigned
  PCI Address: 0xFC8FE000 mapped at 0xF0810000, IRQ Channel: 18
  Controller Queue Depth: 124, Maximum Blocks per Command: 128
  Driver Queue Depth: 123, Scatter/Gather Limit: 33 of 33 Segments
  Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63
  Physical Devices:
    0:0  Vendor: SEAGATE   Model: ST39175LW         Revision: 0001
         Serial Number: 3AL0NXK100007008KQ52
         Disk Status: Online, 17782784 blocks
    0:1  Vendor: SEAGATE   Model: ST39175LW         Revision: 0001
         Serial Number: 3AL0P60T00007012R69K
         Disk Status: Online, 17782784 blocks
    0:2  Vendor: SEAGATE   Model: ST39175LW         Revision: 0001
         Serial Number: 3AL0P4VE00007012R7QV
         Disk Status: Online, 17782784 blocks
    0:3  Vendor: SEAGATE   Model: ST39175LW         Revision: 0001
         Serial Number: 3AL0P3V700001005HKUC
         Disk Status: Standby, 17782784 blocks
    0:4  Vendor: SEAGATE   Model: ST39175LW         Revision: 0001
         Serial Number: 3AL0P3SW000070113RRF
         Disk Status: Online, 17782784 blocks
    0:5  Vendor: SEAGATE   Model: ST39175LW         Revision: 0001
         Serial Number: 3AL0P1MV00007012RDGX
         Disk Status: Online, 17782784 blocks
  Logical Drives:
    /dev/rd/c0d0: RAID-5, Online, 71131136 blocks, Write Thru
  No Rebuild or Consistency Check in Progress
[7.5.] Other information that might be relevant to the problem
       (please look in /proc and include all information that you
       think to be relevant):
This problem usually seems to occur while writing to the DAC960 array.  I don't
know whether that is coincidence or whether the DAC960 is the problem.
[X.] Other notes, patches, fixes, workarounds:
I've reported this (or a similar) oops several other times.  See:
http://groups.google.com/groups?selm=Pine.LNX.4.10.10106271124500.17066-100000%40galaxy.gkg-com.com
http://groups.google.com/groups?selm=linux.raid.Pine.LNX.4.10.10107161322570.12406-100000%40galaxy.gkg-com.com
http://groups.google.com/groups?selm=linux.raid.Pine.LNX.4.10.10108271318300.31572-100000%40galaxy.gkg-com.com
http://groups.google.com/groups?selm=linux.kernel.20011018091613.Q39861-100000%40mike.localdomain
http://groups.google.com/groups?selm=linux.kernel.20011018114227.O40415-100000%40mike.localdomain
Nobody seems to have any idea what's causing it or how to fix it.
Mike
Hal 9000 - "Put down those Windows disks Dave....  Dave?  DAVE!!"