Re: Linux/Pro -- clusters

Alan Cox (alan@lxorguk.ukuu.org.uk)
Thu, 6 Dec 2001 18:02:12 +0000 (GMT)


> Timeouts for different commands were so different that people ended up
> making most timeouts so long that they no longer made sense for other
> commands etc.

That's per _target_, not per host, which is why it needs to be common code.

> Other device drivers have been able to handle timeouts and errors on
> their own before, and have _not_ had the kinds of horrendous problems
> that the SCSI layer has had.

Every IDE driver uses the same IDE error handling code, because every IDE
driver would otherwise have to carry its own copy of it - ditto SCSI.

> that it is a major mistake to try to have generic error handling. The
> only true generic thing is "this request finished successfully / with an
> error", and _no_ high-level retries etc. It's up to the driver to decide
> if retries make sense.

Retries and retry handling are target specific, not host specific (think
about the ton of logic you need every time your CD-ROM decides to error
a read). You can have a read turn into a sequence of operations while you
go and work out why it failed: ask it if it's ready, tell it to lock the
door, spin up the media, wait for it to be ready, reissue the I/O.

This processing has to be robust because SCSI CD-ROMs, for example, are
rarely robust themselves.
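
As a rough illustration of that sequence, here is a minimal sketch of a
recovery state machine in C. Every name in it (recovery_step,
recovery_start, recovery_next) is made up for this example and is not
taken from any kernel header. The policy is also an assumption: a drive
that already reports ready skips straight to the retry, a failed lock or
spin-up gives up, and the wait-for-ready poll is bounded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical recovery steps for a removable-media target. */
enum recovery_step {
    STEP_TEST_READY,   /* ask the drive whether it is ready */
    STEP_LOCK_DOOR,    /* keep the disc in place during recovery */
    STEP_SPIN_UP,      /* start the media spinning */
    STEP_WAIT_READY,   /* poll until the drive reports ready */
    STEP_REISSUE_IO,   /* retry the original read */
    STEP_DONE,         /* the reissued read succeeded */
    STEP_GIVE_UP       /* report the error to the upper layer */
};

struct recovery {
    enum recovery_step step;
    int polls_left;    /* bound on STEP_WAIT_READY so it cannot loop forever */
};

void recovery_start(struct recovery *r, int max_polls)
{
    assert(r != NULL);
    r->step = STEP_TEST_READY;
    r->polls_left = max_polls;
}

/* Advance the sequence. 'ok' says whether the step just issued succeeded.
 * Returns the next step the caller should issue. */
enum recovery_step recovery_next(struct recovery *r, int ok)
{
    assert(r != NULL);
    switch (r->step) {
    case STEP_TEST_READY:
        /* Assumed policy: an already-ready drive goes straight to the retry. */
        r->step = ok ? STEP_REISSUE_IO : STEP_LOCK_DOOR;
        break;
    case STEP_LOCK_DOOR:
        r->step = ok ? STEP_SPIN_UP : STEP_GIVE_UP;
        break;
    case STEP_SPIN_UP:
        r->step = ok ? STEP_WAIT_READY : STEP_GIVE_UP;
        break;
    case STEP_WAIT_READY:
        if (ok)
            r->step = STEP_REISSUE_IO;
        else if (--r->polls_left <= 0)
            r->step = STEP_GIVE_UP;
        /* otherwise stay in STEP_WAIT_READY and poll again */
        break;
    case STEP_REISSUE_IO:
        r->step = ok ? STEP_DONE : STEP_GIVE_UP;
        break;
    case STEP_DONE:
    case STEP_GIVE_UP:
        break;         /* terminal states */
    }
    return r->step;
}
```

The point of keeping this in one place is that none of it depends on which
host adapter the drive hangs off.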

So it's very much

request->controller
	libscsi -> make me a command block
	issue command

interrupt->controller
	error ?
		libscsi recommend an action please
		add suggested recovery to queue head
		kick request handling
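
The flow above could be sketched in C roughly as follows. This is only an
illustration: lib_make_command, lib_recommend_action, queue_add_head and
driver_interrupt are hypothetical names, not an existing libscsi
interface, and the advice table is invented. It also assumes the failed
command is put back on the queue behind the recovery command so that it
gets reissued afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; this is not a real libscsi API. */
enum op { OP_READ, OP_TEST_UNIT_READY, OP_START_UNIT, OP_NONE };

struct cmd {
    enum op op;
    struct cmd *next;
};

struct queue {
    struct cmd *head;
};

/* Library side: build a command block for the requested operation. */
void lib_make_command(struct cmd *c, enum op op)
{
    c->op = op;
    c->next = NULL;
}

/* Library side: target-specific advice after a failure.
 * OP_NONE means "no recovery, just report the error". */
enum op lib_recommend_action(const struct cmd *failed)
{
    switch (failed->op) {
    case OP_READ:            return OP_TEST_UNIT_READY;
    case OP_TEST_UNIT_READY: return OP_START_UNIT;
    default:                 return OP_NONE;
    }
}

/* Driver side: push a command at the head so it runs before queued I/O. */
void queue_add_head(struct queue *q, struct cmd *c)
{
    c->next = q->head;
    q->head = c;
}

/* Driver side: completion interrupt for 'done'.
 * Returns 0 on success, 1 if recovery was queued (the caller should kick
 * request handling), -1 if the error must be reported upwards. */
int driver_interrupt(struct queue *q, struct cmd *done, int error,
                     struct cmd *recovery)
{
    enum op advice;

    assert(q != NULL && done != NULL && recovery != NULL);
    if (!error)
        return 0;

    advice = lib_recommend_action(done);
    if (advice == OP_NONE)
        return -1;

    /* Requeue the failed command, then put the recovery step in front. */
    queue_add_head(q, done);
    lib_make_command(recovery, advice);
    queue_add_head(q, recovery);
    return 1;
}
```

The controller driver only owns the queue and the hardware; what to do
about an error stays in the shared library code.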

> (Often retrying _doesn't_ make sense, because the firmware on the
> high-end card or disk itself may already have done retries on its own,
> and high-level error handling is nothing but a waste of time and causes
> the error notification to be even more delayed).

Those devices aren't SCSI controllers, and they don't want to appear as one.
That's a horrible Windows NT habit that harms performance badly. Of course
everyone is now doing it with Linux because someone wouldn't provide more
major numbers.

Which is another thing - can you make the internal dev_t 32 or 64 bits now?
You can have 65536 volumes on an S/390, so even with perfectly distributed
devfs-allocated device identifiers we don't have enough.
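
To put numbers on that, here is a small C sketch. It assumes the current
16-bit identifier with an 8-bit major and an 8-bit minor, and compares it
with one possible 32-bit split (12-bit major, 20-bit minor). That split
and the helper names (old_mkdev, wide_mkdev, wide_major, wide_minor,
dev_space) are chosen purely for illustration; they are not a proposal
for the actual layout.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed current layout: 8-bit major, 8-bit minor in 16 bits. */
#define OLD_MINOR_BITS  8

/* Hypothetical wide layout: 12-bit major, 20-bit minor in 32 bits. */
#define WIDE_MINOR_BITS 20
#define WIDE_MINOR_MASK ((UINT32_C(1) << WIDE_MINOR_BITS) - 1)
#define WIDE_MAJOR_MASK UINT32_C(0xfff)

/* Pack a 16-bit identifier; anything above 8 bits is silently lost. */
uint16_t old_mkdev(unsigned major, unsigned minor)
{
    return (uint16_t)(((major & 0xffu) << OLD_MINOR_BITS) | (minor & 0xffu));
}

/* Pack a 32-bit identifier using the hypothetical 12/20 split. */
uint32_t wide_mkdev(uint32_t major, uint32_t minor)
{
    return ((major & WIDE_MAJOR_MASK) << WIDE_MINOR_BITS)
         | (minor & WIDE_MINOR_MASK);
}

uint32_t wide_major(uint32_t dev)
{
    return dev >> WIDE_MINOR_BITS;
}

uint32_t wide_minor(uint32_t dev)
{
    return dev & WIDE_MINOR_MASK;
}

/* Number of distinct values that fit in a field of 'bits' bits. */
uint64_t dev_space(unsigned bits)
{
    assert(bits < 64);
    return UINT64_C(1) << bits;
}
```

With 16 bits, dev_space(16) is 65536, so 65536 volumes would use up every
identifier in the system and leave nothing for any other device. With the
20-bit minor assumed here, a single major could cover all of them.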

Alan