Message-ID: <20250901160903.2801339-1-usamaarif642@gmail.com>
Date: Mon, 1 Sep 2025 17:09:03 +0100
From: Usama Arif <usamaarif642@...il.com>
To: alx@...nel.org
Cc: linux-man@...r.kernel.org,
david@...hat.com,
lorenzo.stoakes@...cle.com,
hannes@...xchg.org,
baohua@...nel.org,
shakeel.butt@...ux.dev,
ziy@...dia.com,
laoar.shao@...il.com,
baolin.wang@...ux.alibaba.com,
Liam.Howlett@...cle.com,
linux-kernel@...r.kernel.org,
kernel-team@...a.com,
Usama Arif <usamaarif642@...il.com>
Subject: [PATCH] PR_*ET_THP_DISABLE.2const: document addition of PR_THP_DISABLE_EXCEPT_ADVISED
PR_THP_DISABLE_EXCEPT_ADVISED extends PR_SET_THP_DISABLE so that THPs are
provided only when advised. IOW, it allows individual processes to opt out
of THP = "always" into THP = "madvise" without affecting other workloads on
the system. The series has been merged in [1].
This patch documents the changes introduced by the addition of the
PR_THP_DISABLE_EXCEPT_ADVISED flag:
- PR_GET_THP_DISABLE returns a value whose bits indicate how THP-disable
is configured for the calling thread (with or without
PR_THP_DISABLE_EXCEPT_ADVISED).
- PR_SET_THP_DISABLE now uses arg3 to specify whether to disable THP
completely for the process, or to disable it except where advised
(PR_THP_DISABLE_EXCEPT_ADVISED).
[1] https://lore.kernel.org/all/20250815135549.130506-1-usamaarif642@gmail.com/
Signed-off-by: Usama Arif <usamaarif642@...il.com>
---
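Not part of the commit: a minimal sketch of how the documented interface
might be used from C. The helper names (thp_disable_except_advised,
thp_disable_query, thp_disable_decode) and the enum are hypothetical and
exist only for illustration. The fallback value (1 << 1) for
PR_THP_DISABLE_EXCEPT_ADVISED is an assumption for headers that predate
the series in [1]; it is consistent with the PR_GET_THP_DISABLE table
below (bit 1 set, returned value 3) but should be checked against
<linux/prctl.h>. On kernels without the series, the set call is expected
to fail rather than enable the new mode.

```c
#include <sys/prctl.h>

/* Assumed fallback for older headers; verify against <linux/prctl.h>. */
#ifndef PR_THP_DISABLE_EXCEPT_ADVISED
#define PR_THP_DISABLE_EXCEPT_ADVISED (1 << 1)
#endif

/* Hypothetical names for the three states listed in the table. */
enum thp_disable_mode {
    THP_MODE_UNSET,          /* value 0: no THP-disable behaviour */
    THP_MODE_DISABLED,       /* value 1: THP entirely disabled */
    THP_MODE_EXCEPT_ADVISED, /* value 3: THP only where advised */
    THP_MODE_UNKNOWN         /* anything else, including -1 on error */
};

/* Map a PR_GET_THP_DISABLE return value to one of the modes above. */
enum thp_disable_mode thp_disable_decode(int value)
{
    switch (value) {
    case 0:
        return THP_MODE_UNSET;
    case 1:
        return THP_MODE_DISABLED;
    case 3:
        return THP_MODE_EXCEPT_ADVISED;
    default:
        return THP_MODE_UNKNOWN;
    }
}

/* Disable THPs except where advised. Returns 0 on success, or -1 with
 * errno set on failure (for example, on a kernel without the flag). */
int thp_disable_except_advised(void)
{
    return prctl(PR_SET_THP_DISABLE, 1L,
                 (unsigned long)PR_THP_DISABLE_EXCEPT_ADVISED, 0L, 0L);
}

/* Query the current setting. Returns the raw value, or -1 on error. */
int thp_disable_query(void)
{
    return prctl(PR_GET_THP_DISABLE, 0L, 0L, 0L, 0L);
}
```

A caller would typically invoke thp_disable_except_advised() early, check
the result, and then pass thp_disable_query() through thp_disable_decode()
to confirm which mode is in effect.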
man/man2/madvise.2 | 4 +-
man/man2const/PR_GET_THP_DISABLE.2const | 18 ++++++---
man/man2const/PR_SET_THP_DISABLE.2const | 52 +++++++++++++++++++++----
3 files changed, 61 insertions(+), 13 deletions(-)
diff --git a/man/man2/madvise.2 b/man/man2/madvise.2
index 10cc21fa4..6a5290f67 100644
--- a/man/man2/madvise.2
+++ b/man/man2/madvise.2
@@ -373,7 +373,9 @@ nor can it be stack memory or backed by a DAX-enabled device
(unless the DAX device is hot-plugged as System RAM).
The process must also not have
.B PR_SET_THP_DISABLE
-set (see
+set without the
+.B PR_THP_DISABLE_EXCEPT_ADVISED
+flag (see
.BR prctl (2)).
.IP
The
diff --git a/man/man2const/PR_GET_THP_DISABLE.2const b/man/man2const/PR_GET_THP_DISABLE.2const
index 38ff3b370..df239700f 100644
--- a/man/man2const/PR_GET_THP_DISABLE.2const
+++ b/man/man2const/PR_GET_THP_DISABLE.2const
@@ -6,7 +6,7 @@
.SH NAME
PR_GET_THP_DISABLE
\-
-get the state of the "THP disable" flag for the calling thread
+get the state of the "THP disable" flags for the calling thread
.SH LIBRARY
Standard C library
.RI ( libc ,\~ \-lc )
@@ -18,13 +18,21 @@ Standard C library
.B int prctl(PR_GET_THP_DISABLE, 0L, 0L, 0L, 0L);
.fi
.SH DESCRIPTION
-Return the current setting of
-the "THP disable" flag for the calling thread:
-either 1, if the flag is set, or 0, if it is not.
+Return a value whose bits indicate how THP-disable is configured
+for the calling thread.
+The returned value is interpreted as follows:
+.P
+.nf
+.B "Bits"
+.B " 1 0 Value Description"
+ 0 0 0 No THP-disable behaviour specified.
+ 0 1 1 THP is entirely disabled for this process.
+ 1 1 3 THP-except-advised mode is set for this process.
+.fi
.SH RETURN VALUE
On success,
.BR PR_GET_THP_DISABLE ,
-returns the boolean value described above.
+returns the value described above.
On error, \-1 is returned, and
.I errno
is set to indicate the error.
diff --git a/man/man2const/PR_SET_THP_DISABLE.2const b/man/man2const/PR_SET_THP_DISABLE.2const
index 564e005d4..9f0f17702 100644
--- a/man/man2const/PR_SET_THP_DISABLE.2const
+++ b/man/man2const/PR_SET_THP_DISABLE.2const
@@ -6,7 +6,7 @@
.SH NAME
PR_SET_THP_DISABLE
\-
-set the state of the "THP disable" flag for the calling thread
+set the state of the "THP disable" flags for the calling thread
.SH LIBRARY
Standard C library
.RI ( libc ,\~ \-lc )
@@ -15,24 +15,62 @@ Standard C library
.BR "#include <linux/prctl.h>" " /* Definition of " PR_* " constants */"
.B #include <sys/prctl.h>
.P
-.BI "int prctl(PR_SET_THP_DISABLE, long " flag ", 0L, 0L, 0L);"
+.BI "int prctl(PR_SET_THP_DISABLE, long " thp_disable ", unsigned long " flags ", 0L, 0L);"
.fi
.SH DESCRIPTION
-Set the state of the "THP disable" flag for the calling thread.
+Set the state of the "THP disable" flags for the calling thread.
If
-.I flag
-has a nonzero value, the flag is set, otherwise it is cleared.
+.I thp_disable
+has a nonzero value, the THP disable flag is set according to the value of
+.IR flags ,
+otherwise it is cleared.
.P
-Setting this flag provides a method
+This
+.BR prctl (2)
+provides a method
for disabling transparent huge pages
for jobs where the code cannot be modified,
and using a malloc hook with
.BR madvise (2)
is not an option (i.e., statically allocated data).
-The setting of the "THP disable" flag is inherited by a child created via
+The setting of the "THP disable" flags is inherited by a child created via
.BR fork (2)
and is preserved across
.BR execve (2).
+.P
+The behavior depends on the value of
+.IR flags :
+.TP
+.B 0
+The
+.BR prctl (2)
+call will disable THPs completely for the process,
+irrespective of global THP controls or
+.BR MADV_COLLAPSE .
+.TP
+.B PR_THP_DISABLE_EXCEPT_ADVISED
+The
+.BR prctl (2)
+call will disable THPs for the process except when the usage of THPs is
+advised.
+Consequently, THPs will only be used when:
+.RS
+.IP \[bu] 2
+Global THP controls are set to "always" or "madvise" and
+.BR madvise (...,
+.BR MADV_HUGEPAGE )
+or
+.BR madvise (...,
+.BR MADV_COLLAPSE )
+is used.
+.IP \[bu]
+Global THP controls are set to "never" and
+.BR madvise (...,
+.BR MADV_COLLAPSE )
+is used.
+This is the same behavior as if THPs were not disabled at
+the process level.
+.RE
.SH RETURN VALUE
On success,
0 is returned.
--
2.47.3