Re: SCSI/Block boot problems

Russell King (rmk@arm.linux.org.uk)
Sat, 18 Jan 2003 18:58:19 +0000


On Sat, Jan 18, 2003 at 04:57:02PM +0000, Russell King wrote:
> 3. SCSI goes completely gaga after a SCSI disk IO error. I haven't
> got much to say about this other than to supply the kernel messages
> (with some extra ones added to try to track down the problem.)
>
> At this point, we are trying to read the partition table on the
> aforementioned empty SCSI removable drive:
>
> sda:submitting buffer 0 of 1 (cc3fa580) page c026e3c0
> submission done
> prep_rq_fn: device sda ret = 1

Additional debugging shows that the above is due to a suspected media
change - we are dropping out of 2.5.59 drivers/scsi/sd.c:238
(sd_init_command(), sdp->changed true).

It would appear that when we return to scsi_prep_fn(), we release
any buffers allocated to the command structure (via scsi_release_buffers())
but we don't actually free the SCSI command structure itself, which was
allocated via scsi_allocate_device().

This means that we drop one SCSI command structure on the floor each time
we detect the media has changed in a removable media device, which then
causes us to run out of SCSI command structures, eventually bringing the
device to a complete halt.
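
To illustrate the failure mode, here is a minimal user-space sketch of
the leak. It is not kernel code: every toy_* name and POOL_SIZE is
hypothetical, and it only models the pattern described above (buffers
released on the error path, command structure not returned to a fixed
pool). It assumes each rejected request first takes a command from the
pool, as the real path appears to.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 4              /* hypothetical; not the real queue depth */

struct toy_cmd {
    bool in_use;                 /* command structure handed out */
    bool has_buffers;            /* buffers attached to the command */
};

static struct toy_cmd pool[POOL_SIZE];

/* Hand out a free command from the fixed pool, or NULL if none left. */
static struct toy_cmd *toy_alloc_cmd(void)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = true;
            pool[i].has_buffers = true;
            return &pool[i];
        }
    }
    return NULL;
}

static void toy_release_buffers(struct toy_cmd *cmd) { cmd->has_buffers = false; }
static void toy_free_cmd(struct toy_cmd *cmd)        { cmd->in_use = false; }

/*
 * Model of the prep step: returns 1 if the request was prepared,
 * 0 if it was rejected (media changed), -1 if the pool is exhausted.
 * With free_on_error == false the rejected path releases the buffers
 * but leaves the command marked in use, which is the suspected leak.
 */
static int toy_prep(bool media_changed, bool free_on_error)
{
    struct toy_cmd *cmd = toy_alloc_cmd();

    if (cmd == NULL)
        return -1;

    if (media_changed) {
        toy_release_buffers(cmd);
        if (free_on_error)
            toy_free_cmd(cmd);
        return 0;
    }

    /* Normal path, collapsed: the command completes and is freed. */
    toy_release_buffers(cmd);
    toy_free_cmd(cmd);
    return 1;
}

/* Count how many media-changed requests are handled before the pool stalls. */
static int toy_count_until_stall(bool free_on_error, int attempts)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        pool[i].in_use = false;
        pool[i].has_buffers = false;
    }
    for (int n = 0; n < attempts; n++) {
        if (toy_prep(true, free_on_error) < 0)
            return n;
    }
    return attempts;
}
```

Calling toy_count_until_stall(false, 100) stalls once the pool is
empty, i.e. after POOL_SIZE rejected requests, whereas
toy_count_until_stall(true, 100) gets through all 100 attempts. Whether
freeing on the error path is the right fix in the real code depends on
who owns the command structure, which this sketch does not model.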

Unfortunately, SCSI command structures can also come from req->special,
and it is unclear to me at present whether these should be freed as well.
Therefore, someone more knowledgeable about the implementation in this
area needs to review this.

-- 
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html
