System hang problem. 2006-10-03 - By Paul Krizak
We've had similar issues with the OOM killer in RHEL3, and we've found it to be pretty much worthless on our large-memory systems. We've kept it enabled on our servers with 8GB or less because it seems to work reasonably well there, but in many cases the OOM killer will kill important things like the portmapper or ypbind instead of the giant 12G process that was hosing the system. I'm genuinely interested in understanding how the OOM killer algorithm selects processes to kill...
We're running RHEL3U8 btw...
Paul Krizak
Advanced Micro Devices
Linux/Unix Systems Engineering
Silicon Design Division
5900 E. Ben White Blvd. MS 625
Austin, TX 78741
Phone: (512) 602-8775
Manish Neema wrote:
> We see this problem frequently on RHEL3.0 U5 and U7. System would
> completely hang upon memory shortage. The only option left is
> power-cycle (or 'sysrq + b'). System hang occurs with any of the below 3
> overcommit settings:
>
> - default (heuristic) overcommit (overcommit_memory=0)
> - no overcommit handling by kernel (overcommit_memory=1)
> - restrictive overcommit with ratio=100% (overcommit_memory=2;
>   overcommit_ratio=100)
>
> RHEL3.0 U3 would generate an OOM kill "each and every time" it sensed
> system hang but due to other bugs, we had to move away from it. RedHat
> support calls the timely (at least for us) invocation of OOM in U3 a
> buggy implementation and the delayed OOM kill in U5 and U7 the right
> implementation (which we rarely get to see resulting in at least 5
> systems hanging daily!)
>
> Changing overcommit to 2 (and ratio to any where from 1 to 99) would
> result in certain OS processes (automount daemon for e.g.) getting
> killed when all the allowed memory is committed. What is the point in
> reserving some memory if a random root process would get killed leaving
> the system in a totally unknown state?
>
> Any suggestions on how we can prevent system-hang + not have automount
> (and any other root process) die?
>
> TIA,
> -Manish Neema
>
> P.S. Sorry, we cannot move away from RHEL3.0 U7 for a while.
>
> --
> Taroon-list mailing list
> Taroon-list@(protected)
> https://www.redhat.com/mailman/listinfo/taroon-list
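For readers comparing the overcommit settings in the quoted message, here is a rough Python sketch of how the strict mode (overcommit_memory=2) derives its ceiling on committed memory. It assumes the formula in the mainline kernel documentation, swap plus overcommit_ratio percent of RAM; newer kernels also subtract huge pages, and whether the RHEL3 kernel accounts memory in exactly this way is an assumption. The function names are hypothetical helpers, not part of any tool.

```python
import os


def commit_limit_kb(mem_total_kb, swap_total_kb, overcommit_ratio):
    """Approximate commit limit in kB under overcommit_memory=2.

    Assumed formula: swap + RAM * overcommit_ratio / 100.
    """
    return swap_total_kb + mem_total_kb * overcommit_ratio // 100


def parse_meminfo(text):
    """Collect 'Name: <number> ...' lines from /proc/meminfo-style text."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip header lines and anything without a leading numeric value.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


if __name__ == "__main__":
    meminfo_path = "/proc/meminfo"
    ratio_path = "/proc/sys/vm/overcommit_ratio"
    if os.path.exists(meminfo_path) and os.path.exists(ratio_path):
        with open(meminfo_path) as f:
            info = parse_meminfo(f.read())
        with open(ratio_path) as f:
            ratio = int(f.read().strip())
        limit = commit_limit_kb(
            info.get("MemTotal", 0), info.get("SwapTotal", 0), ratio
        )
        print("overcommit_ratio:", ratio)
        print("estimated commit limit (kB):", limit)
```

As a worked case under the same assumption, a machine with 16 GB of RAM and 4 GB of swap at overcommit_ratio=50 would refuse new allocations once about 12 GB is committed. At that point a daemon's own allocation request can fail just as easily as the large job's, which is one plausible reading of the automount deaths described above, though the thread does not confirm the cause.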
--
Taroon-list mailing list
Taroon-list@(protected)
https://www.redhat.com/mailman/listinfo/taroon-list