Re: FW: am-utils or kernel bug ? Seems to be kernel or glibc bug...

Nicolas Turro (Nicolas.Turro@sophia.inria.fr)
14 May 2003 10:27:41 +0200


On Wed, 2003-05-14 at 04:23, Ion Badulescu wrote:
> > > I am running Redhat 9.0 (kernel 2.4.20)
> > > and am-utils (am-utils-6.0.9-2), because I need the browsing feature
> > > that automount doesn't support.
> > >
> > > Unfortunately, amd sometimes hangs at boot time during its
> > > initialization (/etc/rc.d/init.d/amd).
> > > I can reproduce this bug with /etc/rc.d/init.d/amd start / stop
> > > sequences: sometimes the start hangs, sometimes it works.
> > > This bug occurs on ALL RedHat 9.0 boxes we have (7 PCs with totally
> > > different hardware).

...

> > > [root@redhat-serv root]# strace -p 2454
> > > futex(0x4212e1c8, FUTEX_WAIT, -2, NULL <unfinished ...>
> > >
> > >
> > > [root@redhat-serv root]# strace -p 2455
> > > select(1024, [4 5 6 7], NULL, NULL, {932, 980000} <unfinished ...>
>
> I'll be damned if I understand what the futex is used for here. But since
> that's the parent amd, presumably it's waiting for the child to complete
> something, probably a mount.
>
> As for the second trace, we need to know what the four file descriptors are
> for. 'lsof -p 2455' should shed some light...
>
> I suspect either a bug in glibc (likely), or a bug in the way amd uses
> some Unix primitives and which just happen to work on older glibc's (less
> likely). It's going to be rather hard to debug, however, if we can't
> reproduce it locally.
>
> Another suggestion I have is this: boot into an older kernel without futex
> support (2.4.18-27.7.x should do just fine, ignore the missing
> dependencies because they are not fatal). Glibc will adjust to the older
> kernel and use other mechanisms, and we'll see if the hang still occurs.
> Basically, since futexes were back-ported by Red Hat from 2.5 kernels, I
> suspect there might be some bugs or races in there, and this test would
> help to clear it out.
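
For reference, roughly the same information that 'lsof -p 2455' reports can
be read directly from /proc. The following is only a minimal sketch, assuming
a Linux /proc filesystem and sufficient privileges to read the target
process's entries; the function name list_open_fds and the proc_root
parameter are hypothetical, chosen for illustration.

```python
import os


def list_open_fds(pid, proc_root="/proc"):
    """Return {fd_number: target} for a process, read from <proc_root>/<pid>/fd.

    Each entry in that directory is a symlink whose target names the file,
    pipe or socket behind the descriptor (e.g. "socket:[1234]").
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets
```

Run as root, list_open_fds(2455) should show what descriptors 4 to 7 from the
select() trace actually refer to.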

You were right, Ion:
switching to a RH8 kernel (2.4.18-14) solved the issue. I can no longer
reproduce the futex hang in the parent process.
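
The start / stop test described above can be automated along the following
lines. This is a rough sketch, not the exact procedure used here: the helper
names run_with_timeout and stress_start_stop, the cycle count and the
60-second timeout are all assumptions, and a command that exceeds the timeout
is simply treated as a hang.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return True if it finished in time, False if it timed out."""
    try:
        subprocess.run(
            cmd,
            timeout=timeout_s,
            check=False,  # a non-zero exit status is not a hang
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run() kills only the direct child on timeout; daemons it
        # forked stay alive and can then be inspected by hand.
        return False


def stress_start_stop(script, cycles, timeout_s=60):
    """Alternate '<script> start' and '<script> stop'.

    Returns the 1-based cycle in which a command timed out, or None if all
    cycles completed.
    """
    for cycle in range(1, cycles + 1):
        for action in ("start", "stop"):
            if not run_with_timeout([script, action], timeout_s):
                return cycle
    return None
```

For example, stress_start_stop("/etc/rc.d/init.d/amd", 50) run as root would
return None if no start or stop takes longer than the timeout.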

Whom should I contact to get this fixed?

-- 
Nicolas Turro <Nicolas.Turro@sophia.inria.fr>
INRIA
