ext2 inode size (on-disk)

Alexander Viro (viro@math.psu.edu)
Thu, 19 Apr 2001 07:55:20 -0400 (EDT)


Erm... Folks, can ->s_inode_size be not a power of 2? Both
libext2fs and kernel break in that case. Example:

dd if=/dev/zero of=foo bs=1024 count=20480
mkfs -I 192 foo

corrupts memory and segfaults. Reason: ext2_read_inode() (same problem
is present in the kernel version of said beast) finds inode offset within
cylinder group piece of inode table, splits it into block*block_size+offset,
reads the block and works with the structure at given offset.

I.e. it does
group = (ino-1) / inodes_per_group;
number_in_group = (ino-1) % inodes_per_group;
offset_in_group = number_in_group * inode_size;
block_number = inode_table_base[group] + offset_in_group/block_size;
offset_in_block = offset_in_group % block_size

Guess what happens if inode crosses block boundary? Exactly.

AFAICS we have two sane solutions:

a) require inode size to be a power of 2

b) switch to

group = (ino-1) / inodes_per_group;
number_in_group = (ino-1) % inodes_per_group;
block_in_group = number_in_group / inodes_per_block;
number_in_block = number_in_group % inodes_per_block;
block = inode_table_fragments[group] + block_in_group;
offset_in_block = number_in_block * inode_size;

i.e. instead of current "pack inodes into piece of inode table and
pad it in the end" do "pack inodes into blocks padding the end of every block".

Something has to be done - right now mke2fs effectively mandates "inode size
is a power of 2" and as far as I'm concerned it's OK, but segfaulting is
a bit too drastic way of telling user "don't do it"...
Al

PS: can we assume that inodes_per_group is a multiple of inodes_per_block
or it isn't guaranteed?

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/