Source of shmget failure
2005-10-03 - By Lachele Foley (Lists)
I have a cluster of 63 HP ProLiant DL360s (plus a DL380 serving as the root node). Each has two processors and 2 GB of RAM.
# uname -a
Linux our.server 2.4.21-27.ELsmp #1 SMP Wed Dec 1 21:59:02 EST 2004 i686 i686 i386 GNU/Linux
One of my users routinely complains of shmget failures.
Usually, the reason is obvious. Sample reasons are that the entry in /proc/sys/kernel/shmmax has "gone bad" (I've fixed this, I think) or that he has overrun the available hard drive space. In the latter case, the machine reboots itself and whatever reports errors to him calls that a shmget failure.
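To rule out the first cause quickly, a small check along these lines may help. It is only a sketch, assuming a Python 3 interpreter is available on the node; parse_shmmax, read_shmmax, and exceeds_shmmax are hypothetical helper names, and the request size has to come from whatever the user's program actually asks for. Per the shmget(2) man page, a size above SHMMAX fails with EINVAL.

```python
SHMMAX_PATH = "/proc/sys/kernel/shmmax"  # the file mentioned above


def parse_shmmax(text):
    """Return the limit in bytes, or None if the content is not a positive integer."""
    try:
        value = int(text.strip())
    except ValueError:
        return None  # the entry has "gone bad"
    return value if value > 0 else None


def read_shmmax(path=SHMMAX_PATH):
    """Read and parse the current limit from /proc (Linux only)."""
    with open(path) as f:
        return parse_shmmax(f.read())


def exceeds_shmmax(request_bytes, shmmax_bytes):
    """True if a single segment of request_bytes is larger than the limit."""
    return request_bytes > shmmax_bytes
```

For example, exceeds_shmmax(64 * 1024 * 1024, read_shmmax()) reports whether a 64 MB segment (an illustrative figure, not the user's real request) would be rejected on size alone.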
My best guess about the last failure is that there wasn't enough available RAM. free -o on the offending node gave:
# free -o
             total       used       free     shared    buffers     cached
Mem:       2055492     834620    1220872          0      36436     734132
Swap:      2097112    2096720        392
The trouble with this picture is that the node wasn't running anything except the usual background noise -- it certainly should not have required 0.83 G to do it. And, the machine had been in this "idle" state for a good 12+ hours. Shouldn't that be ample time for unused memory to be returned to "free" state?
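One way to read those numbers, as a rough sketch rather than a diagnosis: on Linux the "used" column of free includes buffers and page cache, so subtracting those two columns gives a closer estimate of what processes themselves hold. The Python below (assuming Python 3; parse_free and used_excluding_cache are hypothetical helper names) does that arithmetic on the output quoted above and also works out how full swap is.

```python
FREE_OUTPUT = """\
             total       used       free     shared    buffers     cached
Mem:       2055492     834620    1220872          0      36436     734132
Swap:      2097112    2096720        392
"""


def parse_free(text):
    """Map each row label ('Mem', 'Swap') to a dict of column name -> kB."""
    lines = text.strip().splitlines()
    columns = lines[0].split()
    rows = {}
    for line in lines[1:]:
        label, *values = line.split()
        # The Swap row has only three values; zip stops at the shorter sequence.
        rows[label.rstrip(":")] = dict(zip(columns, map(int, values)))
    return rows


def used_excluding_cache(mem):
    """'used' minus buffers and page cache, in kB."""
    return mem["used"] - mem["buffers"] - mem["cached"]


stats = parse_free(FREE_OUTPUT)
app_used = used_excluding_cache(stats["Mem"])
swap_pct = 100.0 * stats["Swap"]["used"] / stats["Swap"]["total"]
print(app_used)  # 64052
print(swap_pct)
```

By this reading only about 64 MB of the 0.83 G is outside buffers and cache, while swap is almost entirely in use, so the swap line may deserve as much attention as the Mem line.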
After I rebooted, there wasn't any significant change in the number and type of running processes. However, there was a *lot* more free memory.
I started watching memory use. In the course of an hour, free reported that the memory use steadily increased (from about 256000 to 260500 -- it makes a lovely graph).
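For anyone who wants to collect the same kind of graph, here is a minimal sampling sketch. It assumes Python 3 on the node and that /proc/meminfo has "MemTotal:" and "MemFree:" lines in kB; parse_meminfo, used_kb, and sample are hypothetical names, and the one-minute interval is arbitrary.

```python
import time


def parse_meminfo(text):
    """Return {field: value} for lines such as 'MemFree:  1220872 kB'."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip header lines and anything whose first token is not a number.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_kb(fields):
    """Same definition free uses for 'used': total minus free."""
    return fields["MemTotal"] - fields["MemFree"]


def sample(interval_s=60, count=60, path="/proc/meminfo"):
    """Yield (timestamp, used kB) pairs, one per interval."""
    for _ in range(count):
        with open(path) as f:
            yield time.time(), used_kb(parse_meminfo(f.read()))
        time.sleep(interval_s)
```

Looping over sample() and printing each pair gives a two-column log that can be fed straight to a plotting tool.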
So, I've got my nose poked into "Understanding Virtual Memory" at this location:
http://www.redhat.com/magazine/001nov04/features/vm/
Am I looking in the right place? If not, I will gladly accept any suggestions. If I am in the right place, well, I'll still gladly accept suggestions, but don't feel obligated.
Thanks!
:-) Lachele
-- Taroon-list mailing list Taroon-list@(protected) https://www.redhat.com/mailman/listinfo/taroon-list