Yesterday, Linus kindly directed me towards the filesystem implementation 
concerning this, so I produced some diagnostics in that direction. The 
task is to write a big file with dd to a hardware RAID on a dual Athlon MP 
machine, while in the meantime checking the progress with ls -l. Here's 
the relevant part of 'strace -T ls -l' (it is not always so slow)
open(".", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 3 <0.000013>
fstat64(3, {st_mode=S_IFDIR|0755, st_size=160, ...}) = 0 <0.000007>
fcntl64(0x3, 0x2, 0x1, 0x3)             = 0 <0.000011>
brk(0x805b000)                          = 0x805b000 <0.000008>
getdents64(0x3, 0x8058ac8, 0x1000, 0x8058ac8) = 176 <4.726011>
lstat64("bigfile_2", {st_mode=S_IFREG|0644, st_size=2147483648, ...}) = 0 <0.000023>
lstat64("bigfile_3", {st_mode=S_IFREG|0644, st_size=726581248, ...}) = 0 <0.000012>
lstat64("writer", {st_mode=S_IFREG|0755, st_size=58, ...}) = 0 <0.000012>
lstat64("bigfile_", {st_mode=S_IFREG|0644, st_size=2147483648, ...}) = 0 <0.000012>
getdents64(0x3, 0x8058ac8, 0x1000, 0x8058ac8) = 0 <0.000033>
close(3)                                = 0 <0.000013>
The first call to getdents64 takes 4.7s! I captured the task status and 
got this:
writer        S 00000018  4732  1652      1  1653          1242 (NOTLB)
Call Trace:    [sys_wait4+952/1008] [system_call+51/56]
dd            D 0000001A     0  1653   1652                     (NOTLB)
Call Trace:    [sleep_on+74/112]
        [do_journal_begin_r+87/544]
        [unmap_underlying_metadata+26/96]
        [__block_prepare_write+231/752]
        [call_reschedule_interrupt+5/11]
        [journal_begin+19/32]
        [reiserfs_commit_write+136/336]
        [block_prepare_write+29/64]
        [generic_file_write+1256/1776]
        [sys_write+149/272]
        [do_IRQ+197/240]
        [system_call+51/56]
ls            D 080541A4  5084  1659   1562                     (NOTLB)
Call Trace:    [sleep_on+74/112]
        [do_journal_begin_r+87/544]
        [search_by_key+2338/3552]
        [journal_begin+19/32]
        [reiserfs_dirty_inode+88/160]
        [__mark_inode_dirty+46/160]
        [update_atime+81/96]
        [reiserfs_readdir+1112/1136]
        [filemap_nopage+233/528]
        [rb_insert_color+112/240]
        [vfs_readdir+124/192]
        [filldir64+0/320]
        [sys_getdents64+79/186]
        [filldir64+0/320]
        [system_call+51/56]
writer is a shell script calling dd with a blocksize of 1M. The problem of 
course vanishes when using noatime, but it still makes me wonder why a 
single write request is delayed so long. The other possibility to avoid 
this is to use a sync mount, and there I discovered something really 
strange: 2.4.19 gives me about 17MB/s while 2.4.18-3 (RedHat) was creeping 
slow with 10kB/s!
If you have any thoughts on the cause of this behaviour and/or on how to 
fix it, I would be glad to hear them. If it's not too complicated I could 
even code something up myself, and I for sure can do any testing needed.
Thanks in advance,
					Roland
+---------------------------+-------------------------+
|    TU Muenchen            |                         |
|    Physik-Department E18  |  Raum    3558           |
|    James-Franck-Str.      |  Telefon 089/289-12592  |
|    85747 Garching         |                         |
+---------------------------+-------------------------+
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/