Message-ID: <657f10ed-4e82-4048-98ab-1c4b65349298@redhat.com>
Date: Wed, 26 Feb 2025 12:52:29 +0100
From: David Hildenbrand <david@...hat.com>
To: Brendan Jackman <jackmanb@...gle.com>
Cc: Brendan Higgins <brendan.higgins@...ux.dev>,
 David Gow <davidgow@...gle.com>, Rae Moar <rmoar@...gle.com>,
 Andrew Morton <akpm@...ux-foundation.org>, Oscar Salvador
 <osalvador@...e.de>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
 Vlastimil Babka <vbabka@...e.cz>, Michal Hocko <mhocko@...nel.org>,
 linux-kselftest@...r.kernel.org, kunit-dev@...glegroups.com,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 Yosry Ahmed <yosry.ahmed@...ux.dev>
Subject: Re: [PATCH RFC 0/4] mm: KUnit tests for the page allocator

On 25.02.25 13:56, Brendan Jackman wrote:
> On Tue, Feb 25, 2025 at 11:01:47AM +0100, David Hildenbrand wrote:
>>> This is an RFC and not a PATCH because:
>>>
>>> 1. I have not taken much care to ensure the isolation is complete.
>>>      There are probably sources of flakiness and nondeterminism in here.
>>>
>>> 2. I suspect the basic idea might be over-complicated: do we really
>>>      need memory hotplug here? Do we even need the instance of the
>>>      allocator we're testing to have actual memory behind the pages it's
>>>      allocating, or could we just hallucinate a new region of vmemmap
>>>      without any of that awkwardness?
>>>
>>>      One significant downside of relying on memory hotplug is that the
>>>      test won't run if we can't hotplug anything out. That means you have
>>>      to fiddle with the platform to even run the tests - see the
>>>      --kernel_args and --qemu_args I had to add to my kunit.py command
>>>      above.
>>>
>>>      So yeah, other suggestions welcome.
>>>
>>>      2b. I'm not very confident I'm using the hotplug API properly.
>>
>> Me neither ;)
>>
>> Dynamically adding memory to that "fake" node is certainly interesting, but
>> what guarantees do we have that some other features (page migration, memory
>> offlining, page reporting ...) don't interact in weird ways with this "fake"
>> node? Adding special cases all over the place for that feels wrong. I
>> assume this is point 1 you note above.
> 
> Yeah, basically this is the big downside. Changing the system we're
> trying to test in order to make it testable can't be avoided entirely,
> but I am also pretty unhappy with sprinkling "if (node_isolated(node))"
> everywhere.
> 
> I guess the ideal approach is one where, instead of having to modify
> the meaning of node_data, we just support replacing it completely,
> e.g:
> 
> struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
> 		int preferred_nid, nodemask_t *nodemask,
> 		struct pglist_data **node_data)
> {
> 	struct alloc_context ac = { .node_data = node_data };
> 
> 	// ...
> }
> 
> Ideally this could be done in such a way that it disappears completely
> outside of KUnit builds: the interface should be private, and we'd
> wanna get rid of any unnecessary pointer chasing with stuff like:
> 
> #ifdef CONFIG_PAGE_ALLOC_KUNIT_TESTS
> static inline struct pglist_data *ac_node_data(struct alloc_context *ac, int node)
> {
> 	return ac->node_data[node];
> }
> #else
> #define ac_node_data(ac, node) (NODE_DATA(node))
> #endif
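> 
> For illustration (a rough sketch; ac_node_data() is the hypothetical
> helper above, and nid/ac are whatever the call site already has in
> scope), call sites inside the allocator would go from reaching for the
> global node data to resolving it through the context:
> 
> 	/* before: always the real, global per-node data */
> 	struct pglist_data *pgdat = NODE_DATA(nid);
> 
> 	/* after: KUnit can swap in its own node_data via the ac */
> 	struct pglist_data *pgdat = ac_node_data(ac, nid);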
> 
> I initially rejected this approach because it felt "too intrusive",
> but now that I've actually written this RFC I think it could be less
> intrusive than the node_isolated() thing I've proposed here.
> 
> The most obvious challenges I can see there are:
> 
> - There might be places that page_alloc.c calls out to that care about
>    node_data but where we really don't want to plumb the alloc_context
>    through (maybe vmscan.c is already such a place)?
> 
> - I dunno how many more such helpers we'd need beyond ac_node_data(),
>    like do we need ac_nodes_possible_mask() etc etc etc?
> 
> But maybe worth a try - can you see any obvious reason this idea is
> stupid?
> 
>> So I don't quite love the idea at first sight ... but I haven't grasped all
>> details of the full picture yet, I'm afraid.
> 
> Do you have any thoughts about "making up" memory instead of
> hot-unplugging real memory for test usage? That might simplify things
> a bit?
> 
> It seems possible that very little mm code cares if the memory we're
> managing actually exists. (For ASI code we did briefly experiment with
> tracking information about free pages in the page itself, but it's
> pretty sketchy and the presence of debug_pagealloc makes me think
> nobody does it today).

At least when it comes to the buddy, only page zeroing+poisoning should 
access actual page content.

So making up memory might actually work in quite a few setups, assuming
that it will never get allocated.

The "complicated" thing is still that we are trying to test parts of the 
buddy in a well-controlled way while other kernel infrastructure is 
using the buddy in rather uncontrolled ways.


> 
> There might be arch-specific issues there, but for unit tests it
> seems OK if they don't work on every ISA.

Just pointing it out: for the memblock tests (tools/testing/memblock/) we
actually compile memblock.c into a user space application, stubbing out
all external function calls etc. so that we get the basics running.
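
Very roughly, the shape of it is something like this (a hand-wavy
sketch, not the actual harness; the real thing in tools/testing/memblock
is more involved):

/*
 * Hypothetical user space test harness: the kernel file under test is
 * compiled into an ordinary binary, and everything it calls out to is
 * replaced by trivial stubs.
 */
#include <stdio.h>
#include <stdlib.h>

/* Stub for kernel infrastructure the code under test calls into. */
void panic(const char *fmt, ...)
{
	fprintf(stderr, "panic() called from code under test\n");
	exit(1);
}

int main(void)
{
	/* ... drive the compiled-in entry points directly and assert
	 * on the results ... */
	return 0;
}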

It'd probably be quite a bit of work to get page_alloc.c into a similar
shape: likely we'd have to factor out a lot of stuff that's unrelated to
the tests, and think about how to handle some nasty details like the pcp
lists etc. Just wondering, did you think about that option as well?

The nice thing about such an approach is that we can test the allocator
without any possible side effects from the running system.

-- 
Cheers,

David / dhildenb

