Document lowmem_reserve_ratio
Though lower_zone_protection was changed to lowmem_reserve_ratio, the document has not been changed. lowmem_reserve_ratio is quite hard to estimate, but there is no guidance. This patch updates the document for it.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Andrea Arcangeli <andrea@cpushare.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
committed by Linus Torvalds
parent b5beb1caff
commit 7786fa9ac5
@@ -1336,7 +1336,7 @@ legacy_va_layout
 If non-zero, this sysctl disables the new 32-bit mmap mmap layout - the kernel
 will use the legacy (2.4) layout for all processes.
 
-lower_zone_protection
+lowmem_reserve_ratio
 ---------------------
 
 For some specialised workloads on highmem machines it is dangerous for
@@ -1356,25 +1356,71 @@ captured into pinned user memory.
 mechanism will also defend that region from allocations which could use
 highmem or lowmem).
 
-The `lower_zone_protection' tunable determines how aggressive the kernel is
-in defending these lower zones. The default value is zero - no
-protection at all.
+The `lowmem_reserve_ratio' tunable determines how aggressive the kernel is
+in defending these lower zones.
 
 If you have a machine which uses highmem or ISA DMA and your
 applications are using mlock(), or if you are running with no swap then
-you probably should increase the lower_zone_protection setting.
+you probably should change the lowmem_reserve_ratio setting.
 
-The units of this tunable are fairly vague. It is approximately equal
-to "megabytes," so setting lower_zone_protection=100 will protect around 100
-megabytes of the lowmem zone from user allocations. It will also make
-those 100 megabytes unavailable for use by applications and by
-pagecache, so there is a cost.
+The lowmem_reserve_ratio is an array. You can see them by reading this file.
+-
+% cat /proc/sys/vm/lowmem_reserve_ratio
+256 256 32
+-
+Note: # of this elements is one fewer than number of zones. Because the highest
+zone's value is not necessary for following calculation.
 
-The effects of this tunable may be observed by monitoring
-/proc/meminfo:LowFree. Write a single huge file and observe the point
-at which LowFree ceases to fall.
+But, these values are not used directly. The kernel calculates # of protection
+pages for each zones from them. These are shown as array of protection pages
+in /proc/zoneinfo like followings. (This is an example of x86-64 box).
+Each zone has an array of protection pages like this.
 
-A reasonable value for lower_zone_protection is 100.
+-
+Node 0, zone DMA
+pages free 1355
+min 3
+low 3
+high 4
+:
+:
+numa_other 0
+protection: (0, 2004, 2004, 2004)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+pagesets
+cpu: 0 pcp: 0
+:
+-
+These protections are added to score to judge whether this zone should be used
+for page allocation or should be reclaimed.
+
+In this example, if normal pages (index=2) are required to this DMA zone and
+pages_high is used for watermark, the kernel judges this zone should not be
+used because pages_free(1355) is smaller than watermark + protection[2]
+(4 + 2004 = 2008). If this protection value is 0, this zone would be used for
+normal page requirement. If requirement is DMA zone(index=0), protection[0]
+(=0) is used.
+
+zone[i]'s protection[j] is calculated by following exprssion.
+
+(i < j):
+zone[i]->protection[j]
+= (total sums of present_pages from zone[i+1] to zone[j] on the node)
+/ lowmem_reserve_ratio[i];
+(i = j):
+(should not be protected. = 0;
+(i > j):
+(not necessary, but looks 0)
+
+The default values of lowmem_reserve_ratio[i] are
+256 (if zone[i] means DMA or DMA32 zone)
+32 (others).
+As above expression, they are reciprocal number of ratio.
+256 means 1/256. # of protection pages becomes about "0.39%" of total present
+pages of higher zones on the node.
+
+If you would like to protect more pages, smaller values are effective.
+The minimum value is 1 (1/1 -> 100%).
 
 page-cluster
 ------------
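
The calculation documented by this patch can be tried outside the kernel. The following is a minimal Python sketch of the expression for zone[i]'s protection[j] and of the "free pages versus watermark + protection" comparison from the DMA zone example. It is not kernel code: the function names are hypothetical, integer division is an assumption, and the present_pages values are made up so that the result matches the protection array (0, 2004, 2004, 2004) shown in the /proc/zoneinfo example.

```python
def zone_protection(present_pages, ratios, i):
    """Sketch of zone[i]->protection[] as described in the patch text.

    present_pages: present pages per zone on one node, lowest zone first.
    ratios: lowmem_reserve_ratio values (one fewer than the number of zones).
    """
    n = len(present_pages)
    protection = [0] * n  # entries with j <= i stay 0
    higher_pages = 0
    for j in range(i + 1, n):
        # Sum of present_pages from zone[i+1] up to zone[j].
        higher_pages += present_pages[j]
        # Assumption: integer division, since the result is a page count.
        protection[j] = higher_pages // ratios[i]
    return protection


def zone_usable(free_pages, watermark, protection, request_index):
    """Follow the document's wording: the zone is not used when free pages
    are smaller than watermark + protection[request_index]. How the exact
    equality case is treated in the kernel is not covered by this sketch."""
    return not (free_pages < watermark + protection[request_index])


# Hypothetical node: DMA, DMA32 and two empty higher zones.
present = [4096, 513024, 0, 0]
ratios = [256, 256, 32]

dma_protection = zone_protection(present, ratios, 0)
print(dma_protection)                            # [0, 2004, 2004, 2004]
print(zone_usable(1355, 4, dma_protection, 2))   # False: 1355 < 4 + 2004
print(zone_usable(1355, 4, dma_protection, 0))   # True: protection[0] is 0
```

With these inputs, a request that could be served from the Normal zone (index 2) is turned away from the DMA zone, while a DMA request (index 0) is still allowed, which is the behaviour the new text describes. Lowering ratios[0] raises every non-zero entry of the DMA zone's protection array.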