Re: ide drive dying?

jbradford@dial.pipex.com
Sat, 7 Sep 2002 07:09:50 +0100 (BST)


> > Now that the Smart Suite S.M.A.R.T. applications are unmaintained, would
>
> what happened?

I'm not sure, but the last update to the S.M.A.R.T. Suite website, on 3 July this year, says that the page and the applications are no longer maintained.

Seems the Beta of version 2.0 never got finished either :-(.

> > there be any chance of implementing S.M.A.R.T. in to the kernel IDE code?
>
> what would be the benefit? as I understand it, smart is really
> a means of reporting long-term disk status, which is optimally done
> by user-space. even something exotic like failing over to a spare disk
> would clearly be best done in user-space.

You are right, the idea is to monitor the SMART info, ideally from when the drive is new, but at least over a period of time, so that a change in its behavior shows up.
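As a rough user-space illustration of that idea, the sketch below compares a saved baseline of SMART counters against a later reading and reports the ones that moved. The attribute names, the sample numbers and the per-attribute tolerances are all made up for the example. Reading the values from the drive is assumed to happen elsewhere.

```python
def changed_attributes(baseline, current, tolerances=None):
    """Return {name: (old, new)} for counters that moved by more than
    their tolerance since the baseline was recorded.

    baseline, current: dicts mapping a (hypothetical) attribute name to an
    integer value. tolerances: optional dict of allowed drift per attribute;
    attributes not listed there default to a tolerance of 0.
    """
    tolerances = tolerances or {}
    changes = {}
    for name, old in baseline.items():
        new = current.get(name)
        if new is None:
            # Attribute missing from the later reading; nothing to compare.
            continue
        if abs(new - old) > tolerances.get(name, 0):
            changes[name] = (old, new)
    return changes


# Illustrative values only, not taken from a real drive.
baseline = {"spin_up_time_ms": 4100,
            "calibration_retry_count": 0,
            "udma_crc_error_count": 0}
current = {"spin_up_time_ms": 4150,
           "calibration_retry_count": 3,
           "udma_crc_error_count": 0}

for name, (old, new) in changed_attributes(
        baseline, current, {"spin_up_time_ms": 100}).items():
    print(f"{name}: {old} -> {new}")
```

Recording the baseline when the drive is new is what makes the comparison meaningful. A later reading on its own says little about whether the drive is getting worse.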

> > I know the IDE code is already a nightmare, but it would be a nice feature.
>
> what did you have in mind?

Well, nothing very exotic, just some sanity checks on the SMART data when the IDE and SCSI interfaces are probed for devices. Something like:

* Device supports/does not support following SMART features:
    * General attributes
    * Vendor attributes
    * Error log
    * Selftest log
    * Drive info

* SMART is currently enabled/disabled

* Total power-on time is currently foo hours

* Warning if any of the following is excessive:
    * Last spin up time
    * Calibration retry count
    * UDMA CRC Error count
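The report outlined above could be sketched roughly as follows. It is written in Python only to show the logic and is not kernel code. The `SmartInfo` structure, the `probe_report` function and the numeric limits in `LIMITS` are assumptions invented for the example. What counts as "excessive" would in practice depend on the drive and vendor.

```python
from dataclasses import dataclass, field

# Feature names taken from the list above.
FEATURES = ("General attributes", "Vendor attributes", "Error log",
            "Selftest log", "Drive info")

# Hypothetical limits, chosen only so the example has something to compare to.
LIMITS = {"Last spin up time": 8000,
          "Calibration retry count": 10,
          "UDMA CRC Error count": 50}


@dataclass
class SmartInfo:
    """Hypothetical container for what a probe learned about one device."""
    device: str
    enabled: bool
    power_on_hours: int
    features: set = field(default_factory=set)
    counters: dict = field(default_factory=dict)


def probe_report(info):
    """Build the capability lines, status lines and warnings for one device."""
    lines = []
    for feat in FEATURES:
        state = "supports" if feat in info.features else "does not support"
        lines.append(f"{info.device}: {state} {feat}")
    status = "enabled" if info.enabled else "disabled"
    lines.append(f"{info.device}: SMART is currently {status}")
    lines.append(f"{info.device}: total power-on time is currently "
                 f"{info.power_on_hours} hours")
    for name, limit in LIMITS.items():
        value = info.counters.get(name)
        if value is not None and value > limit:
            lines.append(f"{info.device}: WARNING: {name} is excessive "
                         f"({value} > {limit})")
    return lines


# Illustrative input only, not read from a real drive.
hda = SmartInfo("hda", True, 1234,
                features={"General attributes", "Error log"},
                counters={"Last spin up time": 4000,
                          "UDMA CRC Error count": 75})
for line in probe_report(hda):
    print(line)
```

Counters that the drive does not report are skipped rather than treated as zero, so a missing value never triggers or hides a warning.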

> > S.M.A.R.T. is terribly under used at the moment - most people don't even
> > know what it is. Infact, I could be wrong, but isn't a subset of
> > S.M.A.R.T. implemented on modern SCSI disks, too?
>
> I know that most people don't run it, but other than that, how is it
> underused?

Well, I can't see any reason for *not* using it where available. Who wouldn't appreciate a warning on boot up: 'oh, by the way, /dev/hda is about to die in a couple of days'? :-)

> > Monitoring of any kind is always a nice feature to have...
>
> certainly, though that doesn't mean it should move from userspace to
> kernel...

Agreed, there isn't any point in doing monitoring in kernelspace, but capabilities reporting and sanity checks on boot might be useful.

John.