Re: hda: error: DMA in progress..

Martin Dalecki (dalecki@evision-ventures.com)
Fri, 21 Jun 2002 13:10:06 +0200


Użytkownik Jens Axboe napisał:

>>OK. We have now just one single place where IDE_DMA gets unset ->
>>udma_stop. This to too early to reset IDE_BUSY. However it well
>>may be that ide_dma_intr() simply doesn't care about IDE_BUSY.
>>Let's have a look...
>
>
> You can leave IDE_BUSY there, that's ok. It's not invalid for IDE_BUSY
> to be set while IDE_DMA gets cleared. That's expected.
>

>>Well lets look at ata_irq_intr, the end of it:
>>
>> * Note that handler() may have set things up for another
>> * interrupt to occur soon, but it cannot happen until
>> * we exit from this routine, because it will be the
>> * same irq as is currently being serviced here, and Linux
>> * won't allow another of the same (on any CPU) until we return.
>> */
>> if (startstop == ide_stopped) {
>> if (!ch->handler) { /* paranoia */
>> clear_bit(IDE_BUSY, ch->active);
>> do_request(ch);
>> } else {
>> printk("%s: %s: huh? expected NULL handler on
>> exit\n", drive->name, __FUNCTION__);
>> }
>> } else if (startstop == ide_released)
>> queue_commands(drive);
>>
>>I think the above needs more tough now...
>
>
> Same case as the one I described in the email following this, will only
> happen for TCQ with release interrupt enabled. Otherwise it's illegal to
> release the bus from the tcq interrupt handler. Since I removed all
> traces of that long ago, you can safely kill the
>
> } else if (startstop == ide_released)
> queue_commands(drive);
>
> part of it.

I'm glad to get confirmation on this. This leaves only one place, vide:
do_request, where we can queue up new commands. Much easier to trace and
makes queue_commands never run from IRQ context, which is simplyfiying
things too.

> The rest looks sane. If handler returns it's no longer busy
> (ide_stopped), we clear IDE_BUSY (IDE_DMA damn well better be cleared at
> this point as well!!) and let do_request() start a new request (heck or
> the same, we don't know and don't care).

Right now the handlers are expected to clear IDE_BUSY and ->handler
themself. I have now an idea: Could you add a reporting about
the handler function there:

if (test_bit(IDE_DMA, ch->active)) {
printk(KERN_ERR "%s: error: DMA in progress... %p\n", drive->name, ch->handler);
break;
}

And please take a short look at System.map.

This will show which IRQ handler is the culprit...

If it's indeed ide_dma_intr, let's have a look on it:

We see that it's calling udma_stop() immediately. This should
reset IDE_DMA unconditionally.. immediately on enty:

static inline int udma_stop(struct ata_device *drive)
{
clear_bit(IDE_DMA, drive->channel->active);

return drive->channel->udma_stop(drive);
}

Argh... There is a race in the above it should be:

static inline int udma_stop(struct ata_device *drive)
{
int ret = drive->channel->udma_stop(drive);
clear_bit(IDE_DMA, drive->channel->active);
return ret;
}

Or we should move the clar_bit down do ide_dma_intr and
silbings behind __ata_end_request().
And finally we don't clear the IDE_BUSY on this code path.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/