System hang problem.

2006-10-03       - By Manish Neema


Hi Tom,

Thanks for the reply.

We develop EDA software, and most of our tools are pretty memory hungry.
Most of our systems have 2 CPUs, 16GB RAM, and 32GB swap. A small
percentage of machines have 32GB, 64GB, or 128GB RAM. Our queuing system
dispatches as many jobs as a machine has CPUs, so there are times when
more than one job turns out to be memory intensive, causing the machine
to crawl/hang. Also, it is an R&D environment, so code running on the
machine may leak memory at times.

Anyway, since the final memory requirement is not exactly known to the
users before submitting their jobs, we often see machines hanging. I
understand that OOM kill is bad (heuristics can cause any random process
to die) but believe it or not, it used to work perfectly fine for us in
RHEL3.0 U3 (and we are actually expecting RHEL3.0 U5 and U7 to exhibit
similar OOM kills), since none of the memory overcommit settings seems
to be helping effectively.

Are there any /proc knobs that can help limit process SIZE? I know
"limits.conf" allows controlling 'RSS' but we need a control for total
SIZE.
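
To make concrete what a cap on total SIZE (as opposed to RSS) would look like, here is a minimal sketch that assumes jobs can be launched through a small wrapper. It uses the standard RLIMIT_AS resource limit, which bounds a process's total virtual address space. The function name run_with_address_space_cap and the idea of a wrapper are hypothetical, not something our queuing system provides today, and the sketch assumes a Python 3 interpreter on Linux.

```python
import resource
import subprocess


def run_with_address_space_cap(cmd, max_bytes):
    """Run cmd with its total virtual address space capped at max_bytes.

    Hypothetical wrapper for illustration; returns the child's exit code.
    """
    def apply_limit():
        # Runs in the child between fork and exec.
        # Never try to raise the limit above the hard limit we inherited.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        cap = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
        # RLIMIT_AS bounds total virtual size, not resident set size.
        resource.setrlimit(resource.RLIMIT_AS, (cap, cap))

    # preexec_fn is not safe in multi-threaded parents; acceptable for a
    # single-threaded launcher like this sketch.
    return subprocess.run(cmd, preexec_fn=apply_limit).returncode
```

With a cap in place, an allocation that would push the job past the limit fails inside that job (malloc returns NULL, or the interpreter raises an out-of-memory error) instead of dragging the whole machine into swap.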

I would appreciate any further suggestions...

Thanks!
-Manish

-----Original Message-----
From: taroon-list-bounces@(protected)
[mailto:taroon-list-bounces@(protected)] On Behalf Of Tom Sightler
Sent: Tuesday, October 03, 2006 7:29 PM
To: Discussion of Red Hat Enterprise Linux 3 (Taroon)
Subject: Re: System hang problem.

On Tue, 2006-10-03 at 15:23 -0700, Manish Neema wrote:

> Any suggestions on how we can prevent system-hang + not have automount
> (and any other root process) die?

Perhaps this is a silly suggestion, but why wouldn't you just add more
memory/swap to keep the system from needing to invoke the OOM killer?
The system will not OOM kill a process until it's completely out of all
pages in a given zone.  It sounds like you don't have enough memory.  If
you were relying on the OOM killer to keep your system running on U3,
then you still had a problem; a normally running system should not
trigger the OOM killer.

If the memory usage is load driven (for example a dynamic web server
that sees large bursts of traffic) then you need to control the memory
allocation of the system by using the throttling features built into
these systems to limit concurrent connections to a reasonably
serviceable amount.

Is the system really running out of memory, or is it zone starvation?
I've seen cases on large memory systems (systems with 8GB+ of RAM),
where the OOM killer kicks in when low memory is starved even if large
amounts of memory are still available.
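
One rough way to tell the two cases apart is to compare the LowTotal and LowFree fields in /proc/meminfo, which kernels built with high-memory support report separately from HighTotal and HighFree. The sketch below assumes Python 3; the helper names and the 5% threshold are arbitrary choices for illustration, not a recommended value.

```python
def parse_meminfo(text):
    """Return the numeric fields of /proc/meminfo as a dict, e.g. {'LowFree': 8000}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name: <number> ..." lines; skip headers and blanks.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def low_memory_starved(fields, min_free_fraction=0.05):
    """True when free low memory is under min_free_fraction of total low memory.

    The 0.05 default is an assumed threshold for illustration only.
    """
    low_total = fields.get("LowTotal")
    low_free = fields.get("LowFree")
    if not low_total or low_free is None:
        # The kernel does not report a separate low zone here.
        return False
    return low_free / low_total < min_free_fraction


def check_current_system(path="/proc/meminfo"):
    """Read the live meminfo file and report whether low memory looks starved."""
    with open(path) as handle:
        return low_memory_starved(parse_meminfo(handle.read()))
```

If this reports starvation while HighFree is still large, the machine is hitting zone starvation rather than truly running out of memory.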

I would suggest describing a little more about your system (hardware,
RAM, swap) and application environment if you want more constructive
suggestions.

Later,
Tom



--
Taroon-list mailing list
Taroon-list@(protected)
https://www.redhat.com/mailman/listinfo/taroon-list
