2.0.30: Lockups with huge processes mallocing all VM

Karl Kleinpaste (karl@jprc.com)
03 Dec 1997 12:21:53 -0500


Sending this to the list on the suggestion of a private correspondent...

We have a situation where a machine having a great deal of total VM
(384Mbytes RAM, 400Mbytes swap; plus SMP besides) locks up in the face
of a certain program which mallocs VM like popcorn. It's a document
categorization system, passing over a huge corpus of training data.

The specific symptom is that the beast mallocs and mallocs like
there's no tomorrow, as it analyzes documents, building word lists,
collating similar documents, and doing some serious and arcane
statistical work on the mess. As it approaches occupation of the
entire system's available VM, performance drops precipitously, though
it remains responsive and usable with as little as 20Mbytes swap space
remaining. When it finally closes on that last 20 or 10 Mbytes, the
system simply hangs.
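
For anyone wanting to watch a machine approach that last 20 Mbytes, here
is a minimal Python sketch, not something we actually run. It assumes
/proc/meminfo carries "MemFree:" and "SwapFree:" lines with values in kB;
the function names and the 20 Mbyte default are illustrative only. MemFree
ignores buffers and cache, so the figure is a pessimistic estimate.

```python
def free_vm_mbytes(meminfo_text):
    """Return free RAM plus free swap, in Mbytes, from /proc/meminfo-style text."""
    free_kb = 0
    for line in meminfo_text.splitlines():
        fields = line.split()
        # Only "Key: value kB" lines are used; any other line is skipped.
        if (len(fields) >= 2
                and fields[0] in ("MemFree:", "SwapFree:")
                and fields[1].isdigit()):
            free_kb += int(fields[1])
    return free_kb / 1024.0


def is_near_exhaustion(meminfo_text, threshold_mbytes=20):
    """True once free RAM plus free swap is at or below the threshold."""
    return free_vm_mbytes(meminfo_text) <= threshold_mbytes
```

On a live system the text would come from open("/proc/meminfo").read(),
polled every few seconds while the big process runs.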

We can reproduce this fairly reliably. Before anyone had looked at
the code at all closely, some folks were surmising that perhaps Linux
was not guaranteeing the availability of backing store for freshly-
allocated pages, and that perhaps eventually Linux was getting stuck
looking for a free page when none were to be found.

I'm wondering whether this sort of lockup is analogous to the
fragmentation lockups recently mentioned by Bill Hawes and others. If
so, could someone direct me toward Mark Hemment or others doing work
of this sort?

I'm perfectly willing to wade into the kernel mem.mgmt code to figure
out what I can about this, though it sounds like others may be way out
in front on the issue. In the meantime, we're working around the
problem as best we can by imposing datasize limits (via ulimit) since
the problem only presents itself when the machine is out of aggregate
VM anyway -- it doesn't matter if we make this lone process die as
long as the machine as a whole survives.
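
The datasize workaround can also be applied from a launcher rather than
the shell. The following is a rough Python sketch under stated
assumptions: run_with_data_limit is a hypothetical helper name, and the
limit value is whatever suits the machine. It lowers only the soft
RLIMIT_DATA in the child, which is roughly what "ulimit -d" does.

```python
import resource
import subprocess


def run_with_data_limit(argv, limit_bytes):
    """Run argv with its data-segment soft limit capped at limit_bytes."""

    def cap_data_segment():
        # Runs in the child between fork and exec, so the parent is unaffected.
        _soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
        new_soft = limit_bytes
        if hard != resource.RLIM_INFINITY:
            # The soft limit may never exceed the hard limit.
            new_soft = min(limit_bytes, hard)
        resource.setrlimit(resource.RLIMIT_DATA, (new_soft, hard))

    return subprocess.run(argv, preexec_fn=cap_data_segment,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE)
```

With a cap in place, the lone process sees malloc fail and dies (or
copes) well before the machine as a whole runs out of VM. Note that
preexec_fn is not safe in a multi-threaded parent.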

We have about a dozen systems in similar configurations which all show
these symptoms when this particular application is allowed free rein.
The configuration on them all is...

uname -a:
Linux tsunami.jprc.com 2.0.30 #7 Tue Aug 19 13:55:57 EDT 1997 i686

/proc/pci:
PCI devices found:
  Bus 0, device 19, function 0:
    SCSI storage controller: Adaptec AIC-7881U (rev 0).
      Medium devsel. Fast back-to-back capable. IRQ 11. Master Capable. Latency=64. Min Gnt=8.Max Lat=8.
      I/O at 0xec00.
      Non-prefetchable 32 bit memory at 0xffaff000.
  Bus 0, device 18, function 0:
    Ethernet controller: 3Com Unknown device (rev 0).
      Vendor id=10b7. Device id=9050.
      Medium devsel. IRQ 9. Master Capable. Latency=248. Min Gnt=3.Max Lat=8.
      I/O at 0xef00.
  Bus 0, device 7, function 1:
    IDE interface: Intel 82371SB Natoma/Triton II PIIX3 (rev 0).
      Medium devsel. Fast back-to-back capable. Master Capable. Latency=32.
      I/O at 0xffa0.
  Bus 0, device 7, function 0:
    ISA bridge: Intel 82371SB Natoma/Triton II PIIX3 (rev 1).
      Medium devsel. Fast back-to-back capable. Master Capable. No bursts.
  Bus 0, device 0, function 0:
    Host bridge: Intel 82441FX Natoma (rev 2).
      Medium devsel. Fast back-to-back capable. Master Capable. Latency=32.

/proc/cpuinfo:
processor : 0
cpu : 686
model : Pentium Pro
vendor_id : GenuineIntel
stepping : 7
fdiv_bug : no
hlt_bug : no
fpu : yes
fpu_exception : yes
cpuid : yes
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic 11 mtrr pge mca cmov
bogomips : 199.07

processor : 1
cpu : 686
model : Pentium Pro
vendor_id : GenuineIntel
stepping : 7
fdiv_bug : no
hlt_bug : no
fpu : yes
fpu_exception : yes
cpuid : yes
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic 11 mtrr pge mca cmov
bogomips : 199.07

And generally, the boxes are:
- P6, no exceptions.
- varying memory from 128Mbytes through 384Mbytes.
- large swap areas; the beastliest has 400Mbytes.
- most are SMP duals, though the same catatonia happens in uniprocs
  (and I no longer run this application on my uniproc desk machine).
- SEAGATE ST34371W discs.
- IDE CDU311.
- oh, the uniprocs are 2.0.28, SMPs are 2.0.30 (due to AFS
  considerations a few months back -- our SMPs don't need it)

I've also seen the reports that 3c59x.c needs to be very recent (we
use 3c905 boomerangs, mostly), but the hangs we're seeing are quite
specific and reproducible as aggregate VM runs out, which does not
sound like the random 3c59x.c hang lately being reported.

Any clues welcome.

regards,
--karl