changes to wait() in higher 2.4.x kernels? VM problem?

Hubbard, David (dhubbard@dino.hostasaurus.com)
Tue, 25 Jun 2002 16:07:47 -0400


Hi all, I'm experiencing a problem when using later kernels,
both RedHat and standard kernel.org. I'm not sure which
details are needed, so I'll try to give a lot.

The background:

I serve websites off of a variety of RedHat-based machines.
Many run a scripting engine called Empresa written by Miva
Corporation (http://www.miva.com) to process their own
MivaScript. It's very similar to PHP but lacks the features
and costs money. :-) This Empresa binary is built to be used
out of cgi-bin as a handler for .mv files. So in Apache you
add a handler for .mv to call the Empresa binary sitting in
your cgi-bin when serving a .mv file. This is the same way
PHP works when installed in CGI mode: it gets called for every
request.

The problem:

On RedHat 7.3 (2.4.18-based), I am seeing noticeable pauses
on the client side when viewing pages generated by this Empresa
binary. I've discovered that the problem is that Empresa does
its job, returns the dynamic HTML to the client (shown by
a tcpdump), and then exits but stays in a defunct state for
maybe 1/2 to 1 second. During this pause, Apache doesn't serve
the client any of the images it has requested that were referred
to in the HTML output from Empresa, even though tcpdump shows
the requests did come in.

This behavior did not happen on older RedHat 7.2 installs. On
those, Empresa runs and exits so quickly that it is very hard to
even catch it in a running state, let alone defunct. I tried 7.3
on a variety of other machines, some AMD, some Intel, with the
same results. So then I put the 7.2 kernel (2.4.9-based) on these
7.3 boxes, and the problem goes away!! I built a 2.4.18 kernel
from kernel.org, thinking the RedHat patches were to blame, and
the problem comes back. So it definitely
seems to be something relating to the kernel that has changed
between 2.4.9 and 2.4.18 regardless of being a RedHat modified
version or not. The company says Empresa is not at fault since
it's in defunct state and that means it has already exited.
They say the problem probably lies with wait() and the child
not being cleaned up in a timely fashion. Since I'm able to
make the problem appear and disappear solely by switching
between 2.4.9 and 2.4.18, I was thinking maybe it was a kernel
issue.

If someone can point out what more information I should provide
or tell me if such behavior is a known issue, please do, I'll be
happy to help.

Thanks a lot,

David