Hangs seen with the 3ware controller and the 2.4.17 kernel ...

Manish Lachwani (manish@Zambeel.com)
Mon, 11 Nov 2002 23:54:27 -0800


Hello,

I am using a 2.4.17 SMP kernel and version .018 of the 3ware driver. The
hang happens when we have two controllers (8-port and 4-port) with IO going
on both of them: one controller (the 4-port in my experiment) gets a command
timeout and the reset sequence fails. This is a hard kernel hang. The last
message on the window is "reset sequence failed".

kdb for the eh_1 shows:

scsi_error_handler -> scsi_unjam_host -> scsi_try_host_reset -> schedule

I do know that scsi_try_host_reset(..) calls scsi_sleep for 10*HZ.
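
For reference, 10*HZ is a tick (jiffies) count, and HZ is the number of
ticks per second, so that sleep is 10 seconds whatever HZ is. A minimal
sketch of the arithmetic, assuming HZ is 100 (the usual i386 value on 2.4
kernels); ticks_to_seconds is a made-up helper for illustration, not a
kernel function:

```c
/* Assumption: HZ = 100, the usual i386 value on 2.4 kernels. */
#define HZ 100

/* The timeout quoted above: 10*HZ ticks. */
#define SETTLE_TICKS (10 * HZ)

/* Hypothetical helper (not a kernel API): convert a tick count to whole seconds. */
static unsigned long ticks_to_seconds(unsigned long ticks)
{
    return ticks / HZ;
}
```

With these assumptions, ticks_to_seconds(SETTLE_TICKS) is 10, so a thread
seen in schedule under scsi_try_host_reset may just be in that sleep; it only
points to a hang if it never comes back.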

Another scenario that causes a hang is:

scsi_error_handler -> scsi_unjam_host -> scsi_try_to_abort_command ->
schedule

Also, this hang seems to occur only when there are two controllers. When I
tried numerous times with one controller, I could not reproduce the problem.
Is it possible that scsi_unjam_host gets confused when there are two
controllers and the reset fails on one host?

Any help is appreciated ...
