Re: dead mem pages -> dead machines

Marcelo Tosatti (marcelo@conectiva.com.br)
Tue, 10 Jul 2001 22:51:03 -0300 (BRT)


On Tue, 10 Jul 2001, Dirk Wetter wrote:

>
> On Tue, 10 Jul 2001, Mark Hahn wrote:
>
>
> > my point, perhaps to terse, was that you shouldn't run 4.3G of job on
> > a 4G machine, and expect everything to necessarily work.
>
> then i expect to have 300M+ swapped out and not 2.6GB. so what?
>
> > > > they should all have the same priority, so swapping is distributed.
> > > > currently sda5 fills (and judging by the 5, it's not on the fast
> > > > part of the disk) before sdb1 is used.
> > >
> > > well, that's the symptom, but not the disease, medically speaking.
> >
> > no, it's actually orthogonal, mathematically speaking:
> > your swap configuration is inefficient
>
> i am complaining about the fact that the machines start paging
> heavily without a reason and you are telling me that my swap
> config is wrong?
>
> > note also that swap listed as in use is really just allocated,
> > not necessarily used. the current VM preemptively assigns idle pages
> > to swapcache; whether they ever get written out is another matter.
>
> it IS used. after submitting the jobs the nodes were dead for a while since
> they were swapping like hell.
>
> > you should clearly run "vmstat 1" to see whether there's significant
> > si/so. it would be a symptom if there was actually a lot of *both*
> > si and so (thrashing).
>
> they were so dead, i couldn't even type on the console. load was up
> to 30, culprit at that point was kswapd.

Dirk,

Can you boot the kernel with "profile=2" and use the "readprofile" tool to
check where the kernel is wasting its time ? (take a look at the
readprofile man page)

Let me machine stay in the "unusable" state for quite some time before
reading the statistics.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/