Re: Bio pool & scsi scatter gather pool usage

Mark Peloquin (peloquin@us.ibm.com)
Thu, 18 Apr 2002 11:58:54 -0500


Andrew Morton wrote:
>>
>> In EVMS, we are adding code to deal with BIO splitting, to
>> enable our feature modules, such as DriveLinking, LVM, & MD
>> Linear, etc to break large BIOs up on chunk size or lower
>> level device boundaries.

> Could I suggest that this code not be part of EVMS, but that
> you implement it as a library within the core kernel? Lots of
> stuff is going to need BIO splitting - software RAID, ataraid,
> XFS, etc. May as well talk with Jens, Martin Petersen, Arjan,
> Neil Brown. Do it once, do it right...

It has always been my intention to post this initial prototype
to the mailing list (once I knew it worked) for others to
examine, comment on, and suggest improvements to, in the hope
that it will be incorporated into the kernel and be generally
available to everyone.
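
To give a rough idea of the kind of splitting involved, here is a
minimal user-space sketch of cutting a request on chunk-size
boundaries. It is not the EVMS prototype: struct io_range and
split_on_chunks() are hypothetical names made up for illustration,
and a real implementation would operate on BIOs and their vectors
rather than on plain sector ranges.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a BIO: a run of sectors on one device. */
struct io_range {
    unsigned long long start;  /* first sector */
    unsigned long long len;    /* number of sectors */
};

/*
 * Split 'req' wherever it crosses a multiple of 'chunk' sectors.
 * Stores at most 'max' pieces in 'out' and returns the number of
 * pieces the request needs (which may be larger than 'max').
 */
static size_t split_on_chunks(struct io_range req, unsigned long long chunk,
                              struct io_range *out, size_t max)
{
    size_t n = 0;

    assert(chunk > 0);
    while (req.len > 0) {
        /* Sectors remaining before the next chunk boundary. */
        unsigned long long room = chunk - (req.start % chunk);
        unsigned long long take = req.len < room ? req.len : room;

        if (n < max) {
            out[n].start = req.start;
            out[n].len = take;
        }
        n++;
        req.start += take;
        req.len -= take;
    }
    return n;
}
```

With a chunk size of 128 sectors, a 300-sector request starting at
sector 100 is cut at sectors 128, 256 and 384, giving four pieces.
Every piece is a separate allocation, which is where the extra
pressure on the BIO pool comes from.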

>> ...
>>
>> The allocation and initialization of the resulting split
>> BIOs seems to be correct and works in light loads. However,
>> under heavier loads, the assert in scsi_merge.c:82
>> {BUG_ON(!sgpnt)} fires, due to the fact that scatter gather
>> pool for MAX_PHYS_SEGMENTS (128) is empty. This is occurring
>> at interrupt time when __scsi_end_request is attempting to
>> queue the next request.

> You're not the only one... That is placeholder code which
> Jens plans to complete at a later time.

Ok, I wasn't aware it was a temporary solution.

>> ...
>>
>> Have I caused a problem by unrealistically increasing
>> pressure on the BIO pool by a factor of 8? Or have I
>> discovered a problem that can occur on very heavy loads?
>> What are your thoughts on a recommended solution?

> Hopefully, once scsi_merge is able to handle the allocation
> failure correctly, we won't have a problem any more.

Agreed.

> As a temp thing I guess you could increase the size of that
> mempool.

Yes, this should allow me to avoid the issue altogether.

Something still bothers me about this problem. When I used
a private BIO pool, this scatter gather pool didn't get
depleted. Since bdflush, whether awakened by mempool_alloc
or elsewhere, is driving IOs down in both cases, the total
number of IOs should be similar. If that is true, then it
doesn't explain why the scatter gather pool only became
depleted when using the global BIO pool.

Further debugging has shown me that the scatter gather
pool becomes depleted (and can't be grown at interrupt
time) shortly after growing the first depleted pool
(either BIO or BIO vecs, usually BIO vecs) in the IO
drive path.
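
To make the failure mode concrete, here is a small user-space
model of a fixed reserve that cannot be grown by a caller that is
not allowed to wait. It is only a sketch under my own assumptions,
not the kernel's mempool code: struct toy_pool and the toy_pool_*
functions are hypothetical names, and the backing_ok flag simply
stands in for "the normal allocator could satisfy the request".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a pool with a fixed emergency reserve. */
struct toy_pool {
    int reserve;  /* reserved elements currently available */
    int min_nr;   /* size of the reserve when it is full */
};

static void toy_pool_init(struct toy_pool *p, int min_nr)
{
    assert(min_nr >= 0);
    p->reserve = min_nr;
    p->min_nr = min_nr;
}

/*
 * Model of an allocation from a context that may not wait (for
 * example, interrupt time). Returns true if an element was obtained.
 * 'backing_ok' says whether the normal allocator could satisfy the
 * request; if not, the reserve is the only fallback.
 */
static bool toy_pool_alloc_atomic(struct toy_pool *p, bool backing_ok)
{
    if (backing_ok)
        return true;
    if (p->reserve > 0) {
        p->reserve--;
        return true;
    }
    /* Reserve empty and no way to wait or grow: the caller gets nothing. */
    return false;
}

/* Model of a free: refill the reserve first if it is below min_nr. */
static void toy_pool_free(struct toy_pool *p)
{
    if (p->reserve < p->min_nr)
        p->reserve++;
}
```

With a reserve of two and the backing allocator failing, the first
two atomic allocations succeed and the third fails until something
is freed. The failing case corresponds to the situation the BUG_ON
in scsi_merge.c trips over.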

Is there a cause and effect here? Unknown at this time.

Thanks.
Mark
