Hi Anton,
I ran the dbench test as per your suggestions. Now I get throughput numbers
similar to yours.
But the throughput improvement is still not there for my patch. The reason, I
think, is that I did not get many hits on the fget() routine. It would be helpful
if you could tell me how you got fget() chewing up more than its fair share of CPU
time.
For 30 clients:
Base(2.4.2) - Average Throughput = 235.139 MB/S
Base + Files_struct patch - Average Throughput = 235.751 MB/S
I also did profiling while running these tests, using kernprof. The fget() hits
are as below:
Base(2.4.2)                  304
Base + Files_struct patch    189
My sample size while profiling is quite big, though (the top ranker
in the profile is "default_idle" with 28471 hits), so the fget() hit count is
quite low compared to "default_idle".
As you can also see, the files_struct patch is able to reduce the number of hits
to fget() by around 38% (from 304 to 189).
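
For reference, the ratios behind the figures above can be recomputed with a
small helper. This is only a sketch: the function names reduction_pct() and
print_summary() are made up for illustration and are not part of dbench,
kernprof, or the patch. Note that the second ratio compares fget() hits with
"default_idle" hits, not with the total sample count, which is not given here.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage by which `patched` is lower than `base` (negative if it rose). */
double reduction_pct(double base, double patched)
{
    assert(base > 0.0);
    return (base - patched) / base * 100.0;
}

/* Print the ratios for the figures quoted in this mail. */
void print_summary(void)
{
    /* fget() profile hits: base kernel vs. base + files_struct patch */
    printf("fget hit reduction:           %.1f%%\n",
           reduction_pct(304.0, 189.0));

    /* fget() hits relative to the top entry, default_idle (base kernel) */
    printf("fget hits / default_idle hits: %.2f%%\n",
           304.0 / 28471.0 * 100.0);

    /* dbench throughput, 30 clients: a gain shows up as a negative reduction */
    printf("throughput change:            %.2f%%\n",
           -reduction_pct(235.139, 235.751));
}
```

Calling print_summary() from main() prints all three numbers. The point is
that the hit reduction is sizeable in relative terms while the throughput
change stays well under one percent.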
I also looked at the dbench.c code. It does create a number of child processes,
but with fork() and not through __clone().
I think fget() will affect performance in scenarios where the children are
created using clone() with the CLONE_FILES flag set, that is, when many child
processes share the parent's files_struct and everybody tries to acquire the
same files->file_lock. In those scenarios we should see a considerable
performance improvement from the files_struct patch, as in the case of the
"chat" benchmark.
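
To make the fork() versus clone(CLONE_FILES) difference concrete, here is a
minimal userspace sketch (Linux/glibc only). It does not measure lock
contention; it only shows that a child created with CLONE_FILES shares the
parent's descriptor table, while a fork-like child gets a private copy. The
function names and the stack size are made up for illustration and are not
taken from dbench, chat, or the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (64 * 1024)   /* arbitrary size for this sketch */

/* Child body: close the descriptor number handed over by the parent. */
static int close_fd_in_child(void *arg)
{
    int fd = *(int *)arg;
    return close(fd) == 0 ? 0 : 1;
}

/* Returns 1 if fd is open in the calling process, 0 otherwise. */
static int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}

/*
 * Create a child with clone() and let it close one of our descriptors.
 * extra_flags is 0 (fork-like: private copy of the descriptor table) or
 * CLONE_FILES (descriptor table shared with the parent).
 * Returns 1 if the parent's descriptor is still open afterwards,
 * 0 if the child's close() also closed it for the parent, -1 on error.
 */
int parent_fd_survives_child_close(int extra_flags)
{
    int fds[2];
    int status, still_open;
    char *stack;
    pid_t pid;

    /* The sketch relies on the child getting its own copy of memory. */
    assert((extra_flags & CLONE_VM) == 0);

    if (pipe(fds) < 0)
        return -1;
    stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* The stack grows down on common architectures, so pass its top. */
    pid = clone(close_fd_in_child, stack + CHILD_STACK_SIZE,
                extra_flags | SIGCHLD, &fds[0]);
    if (pid < 0 || waitpid(pid, &status, 0) < 0) {
        free(stack);
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    free(stack);

    still_open = fd_is_open(fds[0]);
    if (still_open)
        close(fds[0]);
    close(fds[1]);
    return still_open;
}
```

With extra_flags set to 0 the function should return 1, which is the dbench
case: every child has its own files_struct. With CLONE_FILES it should return
0, because parent and child operate on the same files_struct, and that shared
structure is where files->file_lock would be contended.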
Regards,
Maneesh
-- 
Maneesh Soni <smaneesh@in.ibm.com>
http://lse.sourceforge.net/locking/rclock.html
IBM Linux Technology Center, IBM Software Lab, Bangalore, India