1.300: Some info about the memory management system
From: Michael Coggins (MCOG@CHVM1.VNET.IBM.COM).
1. Does AIX use more paging space than other unix systems?
Under many scenarios, AIX requires more paging space than other unix
systems. The AIX VMM implements a technique called "early allocation of
paging space". When a page is allocated in RAM, and it is not a
"client" (NFS) or a "persistent" (disk file) storage page, then it is
considered a "working" storage page. Working storage pages are commonly
an application's stack, data, and any shared memory segments. So, when
a program's stack or data area is increased, and RAM is accessed, the
VMM will allocate space in RAM and space on the paging device. This
means that even before RAM is exhausted, paging space is used. This
does not happen on many other unix systems, although they do keep track
of total VM used.
Example 1:
Workstation with 64mb RAM is running only one small application that
accesses a few small files. Everything fits into RAM, including all
accessed data. On AIX, some paging space will already be used. On
other unix systems, paging space will be 100% free. This is a clear case
where AIX uses more paging space than the other systems.
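The accounting difference in Example 1 can be sketched with a toy model. This is only an illustration: the function name, its parameters, and the page count of 2000 are hypothetical, and the model ignores everything about a real VMM except the moment at which a paging-space page is taken.

```python
PAGE_KB = 4  # page size in KB (the ps section of this FAQ states 4 KB pages)


def paging_pages_used(working_pages_touched, pages_ever_paged_out, early_allocation):
    """Toy accounting of paging-space pages in use.

    early_allocation=True:  a paging-space page is reserved as soon as a
                            working-storage page is touched in RAM.
    early_allocation=False: a paging-space page is used only once the
                            page has actually been paged out.
    """
    if early_allocation:
        return working_pages_touched
    return pages_ever_paged_out


ram_pages = 64 * 1024 // PAGE_KB   # 64 MB of RAM = 16384 pages
touched = 2000                     # hypothetical working-storage pages in use
assert touched < ram_pages         # everything fits in RAM

# Nothing has been paged out yet, so only early allocation uses paging space.
early = paging_pages_used(touched, 0, early_allocation=True)
late = paging_pages_used(touched, 0, early_allocation=False)
print(early * PAGE_KB, "KB used with early allocation")
print(late * PAGE_KB, "KB used with allocation on page-out")
```

In this model the two policies report the same usage once every touched page has been paged out at least once; they differ only in when the space is taken.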
Example 2:
Same machine as above, except we are in an environment where many
applications are running with inadequate RAM. Also, the system is
running applications that are started, run, and then left idle rather
than used constantly, for example a session of FRAME running in a window.
What happens is that eventually (theoretically) all applications will be
paged out at least once. On the AIX system and the other systems the
total paging requirements will be the same (assuming a similar malloc
algorithm). The major difference is that the AIX system allocated the
paging space pages before they were actually needed, and the other
systems did not allocate them until they were needed. However, most
other systems have an internal variable that gets incremented as virtual
memory pages are used. AIX does not do this. This can cause the AIX
system to run out of paging space (virtual memory), even though malloc()
continues to return memory. This "feature" allows sparse memory
segments to work, but requires that all normal users of malloc()
(sbrk()) know how much virtual memory will be available (actually
impossible), and handle a paging space low condition. A big problem.
There are some pretty obvious pros and cons to both methods of doing
Virtual Memory.
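The failure mode described above (allocation keeps succeeding while paging space runs out) can be illustrated with two toy classes. Both classes, their method names, and the page counts are hypothetical; this is a sketch of the two bookkeeping styles, not a description of how any kernel is implemented.

```python
class CountedVM:
    """Toy model: a counter of virtual pages handed out is checked
    at allocation time, so over-commitment is refused up front."""

    def __init__(self, total_pages):
        self.total_pages = total_pages
        self.committed = 0

    def allocate(self, pages):
        if self.committed + pages > self.total_pages:
            return False          # the allocation call itself fails
        self.committed += pages
        return True


class UncountedVM:
    """Toy model: allocation always succeeds; a paging-space page is
    consumed only when a page is first touched."""

    def __init__(self, paging_pages):
        self.free_paging_pages = paging_pages

    def allocate(self, pages):
        return True               # no counter is consulted

    def touch(self, pages):
        if pages > self.free_paging_pages:
            return False          # paging space is exhausted at use time
        self.free_paging_pages -= pages
        return True


counted = CountedVM(total_pages=1000)
uncounted = UncountedVM(paging_pages=1000)

counted_ok = counted.allocate(1500)       # refused immediately
uncounted_ok = uncounted.allocate(1500)   # accepted
touch_ok = uncounted.touch(1500)          # fails later, when pages are used
print(counted_ok, uncounted_ok, touch_ok)
```

The uncounted style is what lets a sparse segment work (a large allocation of which only a few pages are ever touched), at the cost of moving the failure from allocation time to use time.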
2. How much paging space do I need?
Concerning the rule of thumb of having 2 times RAM for paging space:
this is rather simplistic, as are most rules of thumb. If the machine
is in a "persistent storage environment", meaning that it runs a few
small programs against lots of data, it may not need even as much as 1
times RAM for paging space. For example, a 1GB database server running
on a RISC System/6000 with 256MB of RAM and only about 50MB of "working"
storage does not need 512MB of paging space, or even 256MB. It only
needs the amount of paging space that will allow all its working
storage to be paged out to disk. This is because the 1GB database is
mostly "persistent storage", and will require little or no paging space.
Excessive paging space may simply mean wasted disk space. However,
avoid insufficient paging space. Tip: Don't have more than one paging
space per disk. Tip: Put lots of RAM in your system - it will use it.
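The arithmetic behind the database server example can be written out as a small sketch. The function name and the headroom parameter are hypothetical; the text above only says that paging space must be able to hold all working storage, so any safety margin on top of that is an assumption to choose yourself.

```python
def paging_space_estimate_mb(working_storage_mb, headroom=1.0):
    """Estimate paging space from working storage rather than from RAM.

    headroom is a hypothetical safety multiplier; 1.0 means exactly
    enough to page out all working storage.
    """
    return working_storage_mb * headroom


ram_mb = 256
working_storage_mb = 50   # figure from the database server example

rule_of_thumb_mb = 2 * ram_mb
working_based_mb = paging_space_estimate_mb(working_storage_mb)

print("2 x RAM rule of thumb:", rule_of_thumb_mb, "MB")
print("working-storage based:", working_based_mb, "MB")
```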
3. Why does vmstat show no free RAM pages?
AIX uses RAM as a possibly huge disk buffer. If you read a file in the
morning, that file is read into RAM, and left there. If no other
programs need that RAM, that file will be left in RAM until the machine
is halted. This means that if you need the file again, access will be
quick. If you need that RAM, the system will simply reuse the pages the
file was using, since those pages were flushed back to disk earlier. This
means that you can get a huge speedup in disk access if you have enough
RAM. For example, a 200MB database will just ease into RAM if you have
a 256MB system.
4. Since vmstat shows no free RAM pages, am I out of RAM?
Probably not. Since disk files will be "mapped" into RAM, if vmstat
shows lots of RAM pages FREE, then you probably have too much RAM (not
usual on a RISC System/6000)!
5. Shouldn't the "avm" and the "fre" fields from vmstat add up to something?
No. The "avm" field tells you how much "Active Virtual Memory" AIX
thinks you are using. This will closely match the amount of paging
space you are using. This number has *ABSOLUTELY* nothing to do with
the amount of RAM you are using, and does *NOT* include your mapped
files (disk files).
6. Why does the "fre" field from vmstat sometimes show lots of free
RAM pages?
This will happen after an application that used a lot of RAM via
"working" storage (not NFS storage, and not disk file or "persistent"
storage) exits. When RAM pages that were used by working storage (a
program's stack and data area) are no longer needed, there is no need to
leave them around. AIX completely frees these RAM pages. The time to
access these pages versus a RAM page holding a "sync'd" mapped file is
almost identical. Therefore, there is no need to periodically "flush" RAM.
7. Is the vmstat "fre" field useful?
The vmstat "fre" field represents the number of free page frames. If
the number is consistently small (less than 500 pages), this is normal.
If the number is consistently large (greater than 4000 pages), then you
have more memory than you need in this machine.
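The two thresholds above can be applied to a series of "fre" readings as follows. This is a minimal sketch: the function name and the sample values are hypothetical, "consistently" is interpreted here as "in every sample", and the text gives no guidance for values between the thresholds, so those cases are reported as inconclusive.

```python
def interpret_fre(samples):
    """Classify a list of vmstat "fre" readings (in pages)."""
    if not samples:
        raise ValueError("need at least one sample")
    if all(s < 500 for s in samples):
        return "normal"
    if all(s > 4000 for s in samples):
        return "more memory than needed"
    return "no conclusion"


busy = [120, 310, 95, 240]         # hypothetical readings
idle = [5200, 6100, 4800, 7000]    # hypothetical readings
mixed = [300, 4500, 900]           # hypothetical readings

print(interpret_fre(busy))
print(interpret_fre(idle))
print(interpret_fre(mixed))
```

With 4 KB pages, the 4000-page threshold corresponds to roughly 16 MB of free RAM.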
1.301: How much should I trust the ps memory reports?
From: chukran@austin.VNET.IBM.COM
Using "ps vg" gives a per-process tally of memory usage for each running
process. Several fields give memory usage in different units, but these
numbers do not tell the whole story of where all the memory goes.
First of all, the man page for ps does not give an accurate description
of the memory related fields. Here is a better description:
RSS - This tells how much RAM resident memory is currently being used
for the text and data segments for a particular process in units of
kilobytes. This value will always be a multiple of 4, since memory is
allocated in 4 KB pages.
%MEM - This is the RSS of a particular process divided by the total size
of RAM. Since RSS is some subset of the total resident memory usage for
a process, the %MEM value will also be lower than actual.
TRS - This tells how much RAM resident memory is currently being used
for the text segment for a particular process in units of kilobytes.
This will always be less than or equal to RSS.
SIZE - This tells how much paging space is allocated for this process
for the text and data segments in units of kilobytes. If the executable
file is on a local filesystem, the page space usage for text is zero.
If the executable is on an NFS filesystem, the page space usage will be
nonzero. This number may be greater than RSS, or it may not, depending
on how much of the process is paged in. The reason RSS can be larger is
that RSS counts text whereas SIZE does not.
TSIZ - This field is absolutely bogus because it is not a multiple of 4
and does not correlate to any of the other fields.
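The relationships among these fields can be expressed as a few checks. This is a sketch only: the function name and the sample numbers are hypothetical, the values are assumed to have been read off a "ps vg" listing by hand, and no attempt is made to parse ps output.

```python
def check_ps_fields(rss_kb, trs_kb, total_ram_kb):
    """Return (%MEM computed from RSS, list of inconsistencies found)."""
    problems = []
    if rss_kb % 4 != 0:
        problems.append("RSS is not a multiple of 4 KB")
    if trs_kb > rss_kb:
        problems.append("TRS exceeds RSS")
    pct_mem = 100.0 * rss_kb / total_ram_kb
    return pct_mem, problems


# Hypothetical process on a machine with 64 MB of RAM.
pct_mem, problems = check_ps_fields(rss_kb=1024, trs_kb=256,
                                    total_ram_kb=64 * 1024)
print(pct_mem, problems)
```

Because RSS covers only the text and data segments, the computed percentage is a lower bound on what the process really occupies.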
These fields report only on a process's text and data segments. Segment
sizes that cannot be interrogated at this time are:
  Text portion of shared libraries (segment 13)
  Files that are in use. Open files are cached in memory as
    individual segments. The traditional kernel cache buffer
    scheme is not used in AIX 3.
  Shared data segments created with shmat.
  Kernel segments such as kernel segment 0, kernel extension
    segments, and virtual memory management segments.
Speaking of kernel segments, the %MEM and RSS values reported for process
zero are totally bogus for AIX 3.1. The reason RSS is so big is that
kernel segment zero is counted twice. For AIX 3.2, this has been
changed, but the report still does not tell the whole story. The RSS
value for process 0 now reports a very small number, the size of the
swapper's private data segment. It does not report the size of kernel
segment 0, where the swapper code lives.
In summary, ps is not a very good tool to measure system memory usage.
It can give you some idea where some of the memory goes, but it leaves
too many questions unanswered about the total usage.