Red Hat Linux release 4.1 (Vanderbilt)
Kernel 2.0.27 on an i586
I am using 128MB for swap space (I always make swap the same size as the RAM):
/sbin/fdisk /dev/hda
Command (m for help): p
Disk /dev/hda: 64 heads, 63 sectors, 648 cylinders
Units = cylinders of 4032 * 512 bytes
   Device Boot   Begin    Start      End   Blocks   Id  System
/dev/hda1            1        1       66   133024+  82  Linux swap
/dev/hda2           67       67      648  1173312   83  Linux native
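As a quick sanity check, the Blocks column can be reproduced from the geometry fdisk prints. This is only a sketch: it assumes the Blocks column is in 1K units and that the first partition starts one track (63 sectors) into the disk, which is the usual DOS-style layout but is not stated in the output. The variable names are my own.

```shell
#!/bin/sh
# Geometry as printed by fdisk for /dev/hda.
heads=64
sectors=63
bytes_per_sector=512

# One cylinder = heads * sectors sectors (4032, matching the "Units" line).
sectors_per_cyl=$((heads * sectors))

# /dev/hda1 covers cylinders 1-66.
# Assumption: it starts one track (63 sectors) into the disk.
hda1_sectors=$((66 * sectors_per_cyl - sectors))

# /dev/hda2 covers cylinders 67-648, all whole cylinders.
hda2_sectors=$(( (648 - 67 + 1) * sectors_per_cyl ))

# Convert 512-byte sectors to 1K blocks. Integer division drops the odd
# half block, which fdisk marks with a trailing "+".
hda1_blocks=$((hda1_sectors * bytes_per_sector / 1024))
hda2_blocks=$((hda2_sectors * bytes_per_sector / 1024))

echo "hda1: $hda1_blocks blocks, about $((hda1_blocks / 1024)) MB"
echo "hda2: $hda2_blocks blocks, about $((hda2_blocks / 1024)) MB"
```

Under those assumptions the numbers come out to 133024 and 1173312 blocks, the same as the fdisk listing, so the swap partition is just over 128MB.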
I have had no problems for over one year.
While initially setting up the system I would get some kernel panic errors
relating to the SCSI drives. I traced each of them to one of the following:
1. bad SCSI cable
2. bad hard drive
3. bad controller card
Here are the things I recommend trying to fix your errors:
1. Test each of your SCSI cards with the swap space: basically just swap
them, or remove the other one, the one you did not remove in your
previous test.
2. Do the same for your SCSI drives: move the swap space to your other drive.
3. If you have an IDE port on your system, put an IDE drive on your system and
move the swap partition onto the IDE drive.
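Moving the swap space as in items 2 and 3 could look roughly like the sketch below. It only prints the commands so you can read them over before running anything as root. It assumes the new partition already exists and has type 82 (Linux swap) in fdisk. The function name plan_swap_move and the device names in the example call are placeholders of mine, so substitute your own.

```shell
#!/bin/sh
# Print (do not run) the commands to move swap from one partition to another.
# Both device names come from the caller; the ones in the example call at
# the bottom are placeholders, not taken from your message.
plan_swap_move() {
    old_dev=$1
    new_dev=$2
    if [ -z "$old_dev" ] || [ -z "$new_dev" ]; then
        echo "usage: plan_swap_move OLD_SWAP_DEV NEW_SWAP_DEV" >&2
        return 1
    fi
    # Enable the new swap area before disabling the old one, so the kernel
    # always has somewhere to put pages that are already swapped out.
    echo "mkswap -c $new_dev    # set up the swap area, checking for bad blocks"
    echo "swapon $new_dev       # start using the new swap area"
    echo "swapoff $old_dev      # stop using the old one"
    echo "# then edit /etc/fstab: replace $old_dev with $new_dev on the swap line"
}

# Example: swap currently on a SCSI partition, moving to an IDE partition.
plan_swap_move /dev/sda1 /dev/hdb1
```

Once the new swap is active, re-run your stress tests and see whether the "bad swap device" messages follow the drive, the controller, or neither.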
If my suggestions help you, maybe you can offer some help when I am ready
to set up raid on my system.
Good Luck,
Brian
On Tue, 2 Dec 1997, David Mansfield wrote:
> Hello, I've been trying to decide whether to set up a production web
> server/dial-in server using the RAID 1 mirroring. I've set it up and it
> seems to work OK but I've gotten a couple of oopses and a lot of
> interesting syslog kernel messages. I am close to suspecting bad memory,
> although the memtest (run about 10 times) doesn't show anything. The
> system looks like this:
>
> Pentium 150.
> 64 MB ram.
> No ide drives
> Adaptec 2940UW with 2x Quantum HD
> (Note: I was running with twin adapters for a while and trimmed to
> one, which didn't eliminate the problems)
> Kernel 2.0.32 with raid145-0.36.3-2.0.30.gz patch.
> (Note: although the patch is for 2.0.30 it applied cleanly...)
> raidtools-0.41
>
> The rest is a stock RedHat 4.2 distribution.
>
> My tests are the following:
> --- test 1 ---
> cd /usr/src/linux
> while true; do make dep; make clean; make zImage; make modules; done
> --- test 2 ---
> while true; do cp -a /usr/src/linux /tmp/test; rm -r /tmp/test; done
> --- test 3 ---
> short c program that mallocs 10 mb and writes a random value to a random
> spot in this buffer (keeps all 10mb swapping)
> ----
>
> I ran test1 + test2 + (7 x test3) to stress test the system. Note, since
> the system has only 64 MB this puts me about 20MB into swap.
>
> Here are the results. There were lots of these (for some reason my syslog
> has disappeared, but there were a number of these errors, at least 40 over
> a period of 12 hours):
>
> kernel: Internal error: bad swap device
> kernel: rw_swap_page: weirdness
> kernel: swap_free: weirdness
> kernel: Trying to free non-existant swap page
> kernel: Trying to swap to non swap device
>
>
> One of:
> Dec 2 10:40:08 tempiws kernel: Unable to handle kernel paging request at
> virtual address 081c8000
> Dec 2 10:40:08 tempiws kernel: current->tss.cr3 = 039a9000, 8r3 =
> 039a9000
> Dec 2 10:40:08 tempiws kernel: *pde = 00bbd067
> Dec 2 10:40:08 tempiws kernel: *pte = 68747561
>
> And three oops (first two copied by hand):
> CPU: 0
> EIP: 0010 [<00123d7a>]
> EFLAGS: 00010246
> eax: 00001800 ebx: 52565253 ecx: 0381944c edx: 00000c00
> esi: 00000bc1 edi: 00000000 ebp: bffffe60 esp: 03993f84
> ds:0018 es:0018 fs:002b gs:0026 ss:0018
> Process update (pid: 305, process nr:27, stackpage:03993000)
> Stack 0031b810 00000000 00000000 00126c94 00000000 00000000 0031b810
> 00000000
> 00000000 0031b810 00126df1 0031b810 00000001 0010a86d 00000001
> 00000000
> 00000000 00000001 00000000 bffffe60 ffffffda 0000002b 0000002b
> 0000002b
> Call Trace: [<00126c94>] [<00126df1>] [<0010a86d>]
> general protection: 0000
>
> and ksymoops says:
> Using `/usr/src/linux/System.map' to map addresses to symbols.
>
> >>EIP: 123d7a <sync_inodes+1e/58>
> Trace: 126c94 <sync_old_buffers+14/13c>
> Trace: 126df1 <sys_bdflush+35/98>
> Trace: 10a86d <system_call+55/7c>
>
> Second oops:
> CPU 0
> EIP: 0010: [<0011ac2b>]
> EFLAGS: 00010246
> eax: 00000000 ebx: 00fd2bfc ecx: 00000400 edx: 02001000
> esi: 00fae660 edi: ds001000 ebp: 0009ad98 esp: 00006f58
> ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
> Process init (pid 1; process nr: 1; stackpage=00006000)
> Stack: 0011aa5x bfd98f68 00099414 00099414 fffffff3 00105025 0010a4f3
> 000998b4
> 00105025 00105025 00111624 00099414 0009ad98 bfd98000 00000001 00111508
> 00000002 0804bccc bfd9906c 00099414 0377f618 bfd99720 0010a9d0 00006fbc
> Call Trace: [<0011aa5c>] [<0010a4f3>] [<00111624>] [<00111508>]
> [<0010a9d0>]
> Code: f3 ab 0b 55 0c 89 54 24 18 89 54 24 1c 8b 44 24 18 0c 40 89
>
> and ksymoops says:
> Using `/usr/src/linux/System.map' to map addresses to symbols.
>
> >>EIP: 11ac2b <do_no_page+1cf/328>
> Trace: 11ac2b <do_no_page+1cf/328>
> Trace: 10a4f3 <handle_signal+5b/90>
> Trace: 111624 <do_page_fault+11c/310>
> Trace: 111624 <do_page_fault+11c/310>
> Trace: 10a9d0 <error_code+40/48>
>
> Code: 11ac2b <do_no_page+1cf/328> repz stosl %eax,%es:(%edi)
> Code: 11ac2d <do_no_page+1d1/328> orl 0xc(%ebp),%edx
> Code: 11ac30 <do_no_page+1d4/328> movl %edx,0x18(%esp,1)
> Code: 11ac34 <do_no_page+1d8/328> movl %edx,0x1c(%esp,1)
> Code: 11ac38 <do_no_page+1dc/328> movl 0x18(%esp,1),%eax
> Code: 11ac3c <do_no_page+1e0/328> orb $0x40,%al
> Code: 11ac3e <do_no_page+1e2/328> movl %eax,(%eax)
> Code: 11ac40 <do_no_page+1e4/328> nop
> Code: 11ac41 <do_no_page+1e5/328> nop
> Code: 11ac42 <do_no_page+1e6/328> nop
>
> Third oops (this one got logged so the symbols are already here):
> Oops: 0009
> CPU: 0
> EIP: 0010:[ext2_file_write+585/1116]
> EFLAGS: 00010216
> eax: 028c8598 ebx: 00000400 ecx: 00000100 edx: 034f2400
> esi: 081c8000 edi: 034f2400 ebp: 00000400 esp: 038efc04
> ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
> Process cc1 (pid: 4190, process nr: 33, stackpage=038ef000)
> Stack: 000c0000 0814b000 0814b000 0010b000 00000000 00000000 001d8dfc
> 0007d000
> 00000000 00000210 00084000 00000000 028c8598 00000000 03bf8a00
> 038efc90
> 00eab500 00008180 03bf8a00 00bd9798 038efc90 00eab500 00125e1a
> 00bd9798
> Call Trace: [do_coprocessor_segment_overrun+4/60] [__brelse+34/68]
> [ext2_create+341/360] [dump_write+28/44] [writenote+167/200]
> [dump_write+28/44] [elf_core_dump+2488/2640] [do_no_page+620/808]
> [timer_bh+193/820] [do_signal+495/632] [signal_return+18/56]
> Code: 64 f3 a5 83 e3 03 89 d9 64 f3 a4 55 8b 54 24 34 8b 52 24 03
>
>
> Does anyone have an opinion on whether RAID 1 is ready to play with the
> big boyz? Should I tuck this one away and try again in 6 months? Does
> it look like processor/memory weirdness? Other experiences and or
> comments welcome.
>
> David Mansfield
> david@cobite.com
>
>
>
_/ _/ _/ _/ _/ _/ _/ Brian Adams
_/ _/ _/ _/_/ _/ _/ _/ adams@xws.com
_/ _/ _/ _/ _/ _/ XWS/Sawtooth Technologies
_/ _/ _/_/ _/_/ _/ _/ http://www.xws.com
_/ _/ _/ _/ _/ _/ 509-427-4865