Well I don't know diddly about the prefetch instruction, but.
> ...
> +
> + if (size > 128) {
> + int i;
> + for(i=0; i<size; i+=64) {
> + prefetch (kaddr+offset);
> + prefetch (kaddr+offset+(L1_CACHE_BYTES*2));
> + }
> + }
> +
> left = __copy_to_user(desc->buf, kaddr + offset, size);
> kunmap(page);
>
This needs to be inside ARCH_HAS_PREFETCH. Otherwise, for
architectures which do not implement prefetch, we have
for (i = 0; i < size; i += 64) {
{;}
{;}
}
and the compiler will *not* optimise this away. The compiler deliberately
leaves this construct alone because it assumes the programmer is asking for
a busywait loop.
We shouldn't add a busywait loop to file_read_actor(), yes?
<ot>
Reminds me of a version of the Microsoft C compiler back in about '85 which
"optimise" away
for ( ; ; )
;
Completely.
</ot>
Is prefetching 4k at a time optimal? Should the prefetch be embedded
in copy_*_user?
The code assumes that L1_CACHE_BYTES equates to the prefetch chunk.
Is this reasonable, or should it be abstracted out to a new PREFETCH_BYTES?
That `64' needs to be PREFETCH_BYTES * 2 or L1_CACHE_BYTES * 2, yes?
How come the code keeps prefetching the same address? Shouldn't
we be adding `i' to the address in there?
The code makes no attempt to align the prefetch address to anything.
Should it?
What happens if a prefetch spills over the end of the page and
faults?
> Comments ?
That'll do for now :)
-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/