Re: Linux/Pro -- clusters

Linus Torvalds (torvalds@transmeta.com)
Thu, 6 Dec 2001 16:58:34 +0000 (UTC)


In article <E16BlnL-00080m-00@the-village.bc.nu>,
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>
>You still need the scsi code. There are a whole sequence of common, quite
>complex and generic functions that the scsi layer handles (in paticular
>error handling).

Well, the preliminary patches already handle _some_ common things, like
building the proper command request for reads and writes etc, and that
will probably continue. We'll probably have to have all the old helpers
for things like "this target only wants to be probed on lun 0" etc.

I disagree about the error handling, though.

Traditionally, the timeouts and the reset handling was handled in the
SCSI mid-layer, and it was a complete and utter disaster. Different
hosts simply wanted so different behaviour that it's not even funny.
Timeouts for different commands were so different that people ended up
making most timeouts so long that they no longer made sense for other
commands etc.

Other device drivers have been able to handle timeouts and errors on
their own before, and have _not_ had the kinds of horrendous problems
that the SCSI layer has had.

We'll see what the details will end up being, but I personally think
that it is a major mistake to try to have generic error handling. The
only true generic thing is "this request finished successfully / with an
error", and _no_ high-level retries etc. It's up to the driver to decide
if retries make sense.

(Often retrying _doesn't_ make sense, because the firmware on the
high-end card or disk itself may already have done retries on its own,
and high-level error handling is nothing but a waste of time and causes
the error notification to be even more delayed).

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/