linux-kernel - Re: [PATCH 6/6] KVM: selftests: Verify that faulting in private guest

Message-ID: <diqza52c1im6.fsf@google.com>
Date: Tue, 30 Sep 2025 07:53:37 +0000
From: Ackerley Tng <ackerleytng@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Christian Borntraeger <borntraeger@...ux.ibm.com>, 
	Janosch Frank <frankja@...ux.ibm.com>, Claudio Imbrenda <imbrenda@...ux.ibm.com>, kvm@...r.kernel.org, 
	linux-kernel@...r.kernel.org, David Hildenbrand <david@...hat.com>, 
	Fuad Tabba <tabba@...gle.com>
Subject: Re: [PATCH 6/6] KVM: selftests: Verify that faulting in private
 guest_memfd memory fails

Sean Christopherson <seanjc@...gle.com> writes:

> On Mon, Sep 29, 2025, Ackerley Tng wrote:
>> Sean Christopherson <seanjc@...gle.com> writes:
>> 
>> 
>> [...snip...]
>> 
>> > -static void test_fault_overflow(int fd, size_t total_size)
>> > +static void *test_fault_sigbus(int fd, size_t size)
>> >  {
>> >  	struct sigaction sa_old, sa_new = {
>> >  		.sa_handler = fault_sigbus_handler,
>> >  	};
>> > -	size_t map_size = total_size * 4;
>> > -	const char val = 0xaa;
>> > -	char *mem;
>> > -	size_t i;
>> > +	void *mem;
>> >  
>> > -	mem = kvm_mmap(map_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd);
>> > +	mem = test_mmap_common(fd, size);
>> >  
>> >  	sigaction(SIGBUS, &sa_new, &sa_old);
>> >  	if (sigsetjmp(jmpbuf, 1) == 0) {
>> > -		memset(mem, 0xaa, map_size);
>> > +		memset(mem, 0xaa, size);
>> >  		TEST_ASSERT(false, "memset() should have triggered SIGBUS.");
>> >  	}
>> >  	sigaction(SIGBUS, &sa_old, NULL);
>> >  
>> > +	return mem;
>> 
>> I think returning the userspace address from a test is a little hard to
>> follow. This one feels even more unexpected because a valid address is
>> being returned (and used) from a test that has sigbus in its name.
>
> Yeah, and it's fugly all around.  If we pass in the "accessible" size, then we
> can reduce the amount of copy+paste, eliminate the weird return and split mmap()
> versus munmap(), and get bonus coverage that reads SIGBUS as well.
>
> How's this look?
>
> static void test_fault_sigbus(int fd, size_t accessible_size, size_t mmap_size)
> {
> 	struct sigaction sa_old, sa_new = {
> 		.sa_handler = fault_sigbus_handler,
> 	};
> 	const uint8_t val = 0xaa;
> 	uint8_t *mem;
> 	size_t i;
>
> 	mem = kvm_mmap(mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd);
>
> 	sigaction(SIGBUS, &sa_new, &sa_old);
> 	if (sigsetjmp(jmpbuf, 1) == 0) {
> 		memset(mem, val, mmap_size);
> 		TEST_FAIL("memset() should have triggered SIGBUS");
> 	}
> 	if (sigsetjmp(jmpbuf, 1) == 0) {
> 		(void)READ_ONCE(mem[accessible_size]);
> 		TEST_FAIL("load at first unaccessible byte should have triggered SIGBUS");
> 	}
> 	sigaction(SIGBUS, &sa_old, NULL);
>
> 	for (i = 0; i < accessible_size; i++)
> 		TEST_ASSERT_EQ(READ_ONCE(mem[i]), val);
>
> 	kvm_munmap(mem, mmap_size);
> }
>
> static void test_fault_overflow(int fd, size_t total_size)
> {
> 	test_fault_sigbus(fd, total_size, total_size * 4);
> }
>

Is it intentional that the same SIGBUS, at address mem + total_size, is
triggered twice? The memset() would have worked fine up to mem +
total_size, which is the same SIGBUS case as mem[accessible_size]. Or
was it meant to test that both reads and writes trigger SIGBUS?

> static void test_fault_private(int fd, size_t total_size)
> {
> 	test_fault_sigbus(fd, 0, total_size);
> }
>

I would prefer more unrolling to avoid mental hoops within test code,
perhaps like (not compile tested):

static void assert_host_fault_sigbus(uint8_t *mem)
{
	struct sigaction sa_old, sa_new = {
		.sa_handler = fault_sigbus_handler,
	};

	sigaction(SIGBUS, &sa_new, &sa_old);
	if (sigsetjmp(jmpbuf, 1) == 0) {
		(void)READ_ONCE(*mem);
		TEST_FAIL("Reading %p should have triggered SIGBUS", mem);
	}
	sigaction(SIGBUS, &sa_old, NULL);
}

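static void test_fault_overflow(int fd, size_t total_size)
{
	const uint8_t val = 0xaa;
	size_t mmap_size = total_size * 2;
	uint8_t *mem = kvm_mmap(mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd);
	size_t i;

	/* Populate the in-bounds range so that the reads below have a known value. */
	memset(mem, val, total_size);
	for (i = 0; i < total_size; i++)
		TEST_ASSERT_EQ(READ_ONCE(mem[i]), val);

	/* The first byte past the end of the file must not be accessible. */
	assert_host_fault_sigbus(mem + total_size);

	kvm_munmap(mem, mmap_size);
}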

static void test_fault_private(int fd, size_t total_size)
{
	uint8_t *mem = kvm_mmap(total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd);

	/* Private memory must not be faultable from the host, even at offset 0. */
	assert_host_fault_sigbus(mem);

	kvm_munmap(mem, total_size);
}

assert_host_fault_sigbus() can then be flexibly reused for conversion
tests (coming up) at various offsets from the mmap()-ed addresses.

At some point, sigaction, sigsetjmp, etc could perhaps even be further
wrapped. For testing memory_failure() for guest_memfd we will want to
check for SIGBUS on memory failure injection instead of on host fault.

Would be nice if it looked like this (maybe not in this patch series):

+ TEST_ASSERT_WILL_SIGBUS(READ_ONCE(mem[i]))
+ TEST_ASSERT_WILL_SIGBUS(WRITE_ONCE(mem[i]))
+ TEST_ASSERT_WILL_SIGBUS(madvise(MADV_HWPOISON))
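
As a rough, self-contained sketch of what such a wrapper could look like
(not from this series, and not using the selftest framework): the macro
below wraps any expression in the sigaction()/sigsetjmp() dance. The
sketch_* names and the tmpfile()-backed mapping are assumptions made only
so the example can run without KVM; a real version would presumably use
TEST_FAIL() instead of abort().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf sketch_jmpbuf;

/* Jump back to the sigsetjmp() site instead of retrying the faulting access. */
static void sketch_sigbus_handler(int signum)
{
	(void)signum;
	siglongjmp(sketch_jmpbuf, 1);
}

/*
 * Evaluate @expr with a temporary SIGBUS handler installed, and abort if
 * the expression completes without raising SIGBUS.
 */
#define TEST_ASSERT_WILL_SIGBUS(expr)					\
do {									\
	struct sigaction sa_old, sa_new = {				\
		.sa_handler = sketch_sigbus_handler,			\
	};								\
									\
	sigaction(SIGBUS, &sa_new, &sa_old);				\
	if (sigsetjmp(sketch_jmpbuf, 1) == 0) {				\
		(void)(expr);						\
		fprintf(stderr, "'%s' should have triggered SIGBUS\n",	\
			#expr);						\
		abort();						\
	}								\
	sigaction(SIGBUS, &sa_old, NULL);				\
} while (0)

/*
 * Hypothetical stand-in for a guest_memfd mapping: map 2 * @size bytes of
 * a temporary file that is only @size bytes long, so that touching the
 * second half raises SIGBUS.  @size is expected to be a multiple of the
 * page size.  Returns NULL on failure.
 */
static uint8_t *sketch_map_past_eof(size_t size)
{
	FILE *file = tmpfile();
	void *mem;

	if (!file || ftruncate(fileno(file), size))
		return NULL;

	mem = mmap(NULL, size * 2, PROT_READ | PROT_WRITE, MAP_SHARED,
		   fileno(file), 0);
	return mem == MAP_FAILED ? NULL : mem;
}
```

With mem = sketch_map_past_eof(size), a load check would be
TEST_ASSERT_WILL_SIGBUS(*(volatile uint8_t *)(mem + size)), and the same
expression with "= 0xaa" appended would cover stores.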

>> > +static void test_fault_private(int fd, size_t total_size)
>> > +{
>> > +	void *mem = test_fault_sigbus(fd, total_size);
>> > +
>> > +	kvm_munmap(mem, total_size);
>> > +}
>> > +
>> 
>> Testing that faults fail when GUEST_MEMFD_FLAG_DEFAULT_SHARED is not set
>> is a good idea. Perhaps it could be even clearer if further split up:
>> 
>> + test_mmap_supported()
>>     + kvm_mmap()
>>     + kvm_munmap()
>> + test_mmap_supported_fault_supported()
>>     + kvm_mmap()
>>     + successful accesses to offsets within the size of the fd
>>     + kvm_munmap()
>> + test_mmap_supported_fault_sigbus()
>>     + kvm_mmap()
>>     + expect SIGBUS from accesses to offsets within the size of the fd
>>     + kvm_munmap()
>> 
>> >  static void test_mmap_not_supported(int fd, size_t total_size)
>> >  {
>> >  	char *mem;
>> > @@ -274,9 +299,12 @@ static void __test_guest_memfd(struct kvm_vm *vm, uint64_t flags)
>> >  
>> >  	gmem_test(file_read_write, vm, flags);
>> >  
>> > -	if (flags & GUEST_MEMFD_FLAG_MMAP) {
>> > +	if (flags & GUEST_MEMFD_FLAG_MMAP &&
>> > +	    flags & GUEST_MEMFD_FLAG_DEFAULT_SHARED) {
>> >  		gmem_test(mmap_supported, vm, flags);
>> >  		gmem_test(fault_overflow, vm, flags);
>> > +	} else if (flags & GUEST_MEMFD_FLAG_MMAP) {
>> > +		gmem_test(fault_private, vm, flags);
>> 
>> test_fault_private() makes me think the test is testing for private
>> faults, but there's nothing private about this fault,
>
> It's a user fault on private memory, not sure how else to describe that :-)
> The CoCo shared vs. private and MAP_{SHARED,PRIVATE} collision is unfortunate,
> but I think we should prioritize standardizing on CoCo shared vs. private since
> that is what KVM will care about 99.9% of the time, i.e. in literally everything
> except kvm_gmem_mmap().
>
>> and the fault doesn't even come from the guest.
>
> Sure, but I don't see what that has to do with anything, e.g. fault_overflow()
> isn't a fault from the guest either.
>

Maybe it's the frame of mind I'm working in (conversions), where all
private faults must be from the guest or from KVM. Feel free to ignore this.

>> >  	} else {
>> >  		gmem_test(mmap_not_supported, vm, flags);
>> >  	}
>> 
>> If split up as described above, this could be
>> 
>> 	if (flags & GUEST_MEMFD_FLAG_MMAP &&
>> 	    flags & GUEST_MEMFD_FLAG_DEFAULT_SHARED) {
>> 		gmem_test(mmap_supported_fault_supported, vm, flags);
>> 		gmem_test(fault_overflow, vm, flags);
>> 	} else if (flags & GUEST_MEMFD_FLAG_MMAP) {
>> 		gmem_test(mmap_supported_fault_sigbus, vm, flags);
>
> I find these unintuitive, e.g. is this one "mmap() supported, test fault sigbus",
> or is it "mmap(), test supported fault sigbus".  I also don't like that some of
> the test names describe the _result_ (SIGBUS), whereas others describe _what_
> is being tested.
>

I think of the result (SIGBUS) as part of what's being tested. So
test_mmap_supported_fault_sigbus() is testing that mmap() is supported,
and that faulting will result in a SIGBUS.

> In general, I don't like test names that describe the result, because IMO what
> is being tested is far more interesting.  E.g. from a test coverage perspective,
> I don't care if attempting to fault in (CoCo) private memory gets SIGBUS versus
> SIGSEGV, but I most definitely care that we have test coverage for the "what".
>

The SIGBUS is part of the contract with userspace and that's also part
of what's being tested IMO.

That said, I agree we don't need sigbus in the name, I guess I just
meant that there are a few layers to test here and I couldn't find a
better name:

1. mmap() succeeds to start with
2. mmap() succeeds, and faulting also succeeds
    + mmap() works, and faulting does not succeed because memory is not
      intended to be accessible to the host
3. mmap() succeeds, and faulting also succeeds, but only within the size of
   guest_memfd

> Looking at everything, I think the only thing that doesn't fit well is the CoW
> scenario.  What if we extract that to its own helper?  That would eliminate the
> ugly test_mmap_common(), 
>

Extracting the CoW scenario is good, thanks!

> So my vote would be to keep things largely the same:
>
> 	if (flags & GUEST_MEMFD_FLAG_MMAP &&
> 	    flags & GUEST_MEMFD_FLAG_DEFAULT_SHARED) {
> 		gmem_test(mmap_supported, vm, flags);
> 		gmem_test(mmap_cow, vm, flags);
> 		gmem_test(fault_overflow, vm, flags);
> 		gmem_test(mbind, vm, flags);
> 		gmem_test(numa_allocation, vm, flags);
> 	} else if (flags & GUEST_MEMFD_FLAG_MMAP) {
> 		gmem_test(fault_private, vm, flags);
> 	} else {
> 		gmem_test(mmap_not_supported, vm, flags);
> 	}
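
For reference, the dispatch above is a three-way split on the two flags.
A minimal standalone sketch of just that selection logic follows; the
SKETCH_* names and bit values are placeholders invented for this example,
not the uapi definitions. Since & binds tighter than &&, the first
condition parses as (flags & MMAP) && (flags & DEFAULT_SHARED) without
extra parentheses.

```c
#include <stdint.h>

/* Placeholder bit values, chosen only for this sketch. */
#define SKETCH_FLAG_MMAP		(1ULL << 0)
#define SKETCH_FLAG_DEFAULT_SHARED	(1ULL << 1)

enum sketch_mmap_tests {
	SKETCH_TESTS_SHARED,	/* mmap_supported, mmap_cow, fault_overflow, ... */
	SKETCH_TESTS_PRIVATE,	/* fault_private */
	SKETCH_TESTS_NO_MMAP,	/* mmap_not_supported */
};

/* Mirror the if/else-if/else chain that picks which mmap() tests to run. */
static enum sketch_mmap_tests sketch_pick_mmap_tests(uint64_t flags)
{
	if (flags & SKETCH_FLAG_MMAP &&
	    flags & SKETCH_FLAG_DEFAULT_SHARED)
		return SKETCH_TESTS_SHARED;
	else if (flags & SKETCH_FLAG_MMAP)
		return SKETCH_TESTS_PRIVATE;

	return SKETCH_TESTS_NO_MMAP;
}
```

Note that in this chain, DEFAULT_SHARED without MMAP falls through to the
mmap_not_supported case.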