[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <22e263a3-a1d2-4159-b3c8-44f7a29bace9@gmail.com>
Date: Fri, 1 Aug 2025 12:42:38 +0100
From: Usama Arif <usamaarif642@...il.com>
To: David Hildenbrand <david@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org
Cc: linux-fsdevel@...r.kernel.org, corbet@....net, rppt@...nel.org,
surenb@...gle.com, mhocko@...e.com, hannes@...xchg.org, baohua@...nel.org,
shakeel.butt@...ux.dev, riel@...riel.com, ziy@...dia.com,
laoar.shao@...il.com, dev.jain@....com, baolin.wang@...ux.alibaba.com,
npache@...hat.com, lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com,
ryan.roberts@....com, vbabka@...e.cz, jannh@...gle.com,
Arnd Bergmann <arnd@...db.de>, sj@...nel.org, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, kernel-team@...a.com
Subject: Re: [PATCH v2 4/5] selftests: prctl: introduce tests for disabling
THPs completely
On 31/07/2025 20:42, David Hildenbrand wrote:
> On 31.07.25 14:27, Usama Arif wrote:
>> The test will set the global system THP setting to never, madvise
>> or always depending on the fixture variant and the 2M setting to
>> inherit before it starts (and reset to original at teardown).
>>
>> This tests if the process can:
>> - successfully set and get the policy to disable THPs completely.
>> - never get a hugepage when the THPs are completely disabled
>> with the prctl, including with MADV_HUGE and MADV_COLLAPSE.
>> - successfully reset the policy of the process.
>> - after reset, only get hugepages with:
>> - MADV_COLLAPSE when policy is set to never.
>> - MADV_HUGE and MADV_COLLAPSE when policy is set to madvise.
>> - always when policy is set to "always".
>> - repeat the above tests in a forked process to make sure
>> the policy is carried across forks.
>>
>> Signed-off-by: Usama Arif <usamaarif642@...il.com>
>> ---
>
> [...]
>
> Looks much better already. Some quirks.
>
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <string.h>
>> +#include <unistd.h>
>> +#include <sys/mman.h>
>> +#include <sys/prctl.h>
>> +#include <sys/wait.h>
>> +
>> +#include "../kselftest_harness.h"
>> +#include "thp_settings.h"
>> +#include "vm_util.h"
>> +
>> +static int sz2ord(size_t size, size_t pagesize)
>> +{
>> + return __builtin_ctzll(size / pagesize);
>> +}
>> +
>> +enum thp_collapse_type {
>> + THP_COLLAPSE_NONE,
>> + THP_COLLAPSE_MADV_HUGEPAGE, /* MADV_HUGEPAGE before access */
>> + THP_COLLAPSE_MADV_COLLAPSE, /* MADV_COLLAPSE after access */
>> +};
>> +
>> +enum thp_policy {
>> + THP_POLICY_NEVER,
>> + THP_POLICY_MADVISE,
>> + THP_POLICY_ALWAYS,
>> +};
>
> Couldn't you have reused "enum thp_enabled" end simply never specified the "THP_INHERIT"? Then, you need to do less translation.
yes, introducing this enum was silly. Have removed it for next revision.>
>> +
>> +struct test_results {
>> + int prctl_get_thp_disable;
>
> The result is always one, does that here make sense?
Its 3 in the next patch for PR_THP_DISABLE_EXCEPT_ADVISED :)
I will remove this struct, but I think maybe it might have been a good idea to squash this
with the next patch to show why the struct was useful.
>
>> + int prctl_applied_collapse_none;
>
> "prctl_applied" is a bit confusing. And most of these always have the same value.
>
> Can't we special case the remaining two cases on the current policy and avoid this struct compeltely?
>
The values are different in the next patch when PR_THP_DISABLE_EXCEPT_ADVISED is used.
Just to explain how I came about using this struct test_results (though it doesnt matter as
I will remove it for the next revision):
I wanted to maximise code reuse and only wanted to have one instance of prctl_thp_disable_test.
I actually started with special casing, but went the brute force way of adding too many if else
statements and it was looking quite messy after I added the tests for PR_THP_DISABLE_EXCEPT_ADVISED.
I saw this struct test_results in another kselftest and thought this should make it much better and
extendable.
I have removed struct test_results and changed prctl_thp_disable_test to the following for next revision:
static void prctl_thp_disable_test(struct __test_metadata *const _metadata,
size_t pmdsize, enum thp_enabled thp_policy,
int prctl_flags)
{
ASSERT_EQ(prctl(PR_GET_THP_DISABLE, NULL, NULL, NULL, NULL),
prctl_flags & PR_THP_DISABLE_EXCEPT_ADVISED ? 3 : 1);
/* tests after prctl overrides global policy */
ASSERT_EQ(test_mmap_thp(THP_COLLAPSE_NONE, pmdsize), 0);
ASSERT_EQ(test_mmap_thp(THP_COLLAPSE_MADV_HUGEPAGE, pmdsize),
thp_policy == THP_NEVER || !prctl_flags ? 0 : 1);
ASSERT_EQ(test_mmap_thp(THP_COLLAPSE_MADV_COLLAPSE, pmdsize),
!prctl_flags ? 0 : 1);
/* Reset to global policy */
ASSERT_EQ(prctl(PR_SET_THP_DISABLE, 0, NULL, NULL, NULL), 0);
/* tests after prctl is cleared, and only global policy is effective */
ASSERT_EQ(test_mmap_thp(THP_COLLAPSE_NONE, pmdsize),
thp_policy == THP_ALWAYS ? 1 : 0);
ASSERT_EQ(test_mmap_thp(THP_COLLAPSE_MADV_HUGEPAGE, pmdsize),
thp_policy == THP_NEVER ? 0 : 1);
ASSERT_EQ(test_mmap_thp(THP_COLLAPSE_MADV_COLLAPSE, pmdsize), 1);
}
>
>> + int prctl_applied_collapse_madv_huge;
>> + int prctl_applied_collapse_madv_collapse;
>> + int prctl_removed_collapse_none;
>> + int prctl_removed_collapse_madv_huge;
>> + int prctl_removed_collapse_madv_collapse;
>> +};
>> +
>> +/*
>> + * Function to mmap a buffer, fault it in, madvise it appropriately (before
>> + * page fault for MADV_HUGE, and after for MADV_COLLAPSE), and check if the
>> + * mmap region is huge.
>> + * Returns:
>> + * 0 if test doesn't give hugepage
>> + * 1 if test gives a hugepage
>> + * -errno if mmap fails
>> + */
>> +static int test_mmap_thp(enum thp_collapse_type madvise_buf, size_t pmdsize)
>> +{
>> + char *mem, *mmap_mem;
>> + size_t mmap_size;
>> + int ret;
>> +
>> + /* For alignment purposes, we need twice the THP size. */
>> + mmap_size = 2 * pmdsize;
>> + mmap_mem = (char *)mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
>> + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>> + if (mmap_mem == MAP_FAILED)
>> + return -errno;
>> +
>> + /* We need a THP-aligned memory area. */
>> + mem = (char *)(((uintptr_t)mmap_mem + pmdsize) & ~(pmdsize - 1));
>> +
>> + if (madvise_buf == THP_COLLAPSE_MADV_HUGEPAGE)
>> + madvise(mem, pmdsize, MADV_HUGEPAGE);
>> +
>> + /* Ensure memory is allocated */
>> + memset(mem, 1, pmdsize);
>> +
>> + if (madvise_buf == THP_COLLAPSE_MADV_COLLAPSE)
>> + madvise(mem, pmdsize, MADV_COLLAPSE);
>> +
>
> To avoid even mmap_mem to get merged with some other VMA, maybe just do
> before reading the smap here:
>
> /* HACK: make sure we have a separate VMA that we can check reliably. */
> mprotect(mem, pmdsize, PROT_READ);
>
Thanks! Yeah this is a nice hack, have used it in the next revision.
> or
>
> madvise(mem, pmdsize, MADV_DONTFORK);
>
> before reading smaps.
>
> That is probably the easiest approach. The you can drop the lengthy comment and perform a single thp check.
>
>
[...]
Powered by blists - more mailing lists