Message-ID: <mafs0wm4ke2wq.fsf@kernel.org>
Date: Fri, 24 Oct 2025 17:27:01 +0200
From: Pratyush Yadav <pratyush@...nel.org>
To: Pasha Tatashin <pasha.tatashin@...een.com>
Cc: Mike Rapoport <rppt@...nel.org>,  Pratyush Yadav <pratyush@...nel.org>,
  akpm@...ux-foundation.org,  brauner@...nel.org,  corbet@....net,
  graf@...zon.com,  jgg@...pe.ca,  linux-kernel@...r.kernel.org,
  linux-kselftest@...r.kernel.org,  linux-mm@...ck.org,
  masahiroy@...nel.org,  ojeda@...nel.org,  rdunlap@...radead.org,
  tj@...nel.org
Subject: Re: [PATCHv7 5/7] kho: don't unpreserve memory during abort

On Fri, Oct 24 2025, Pasha Tatashin wrote:

> On Thu, Oct 23, 2025 at 3:21 AM Mike Rapoport <rppt@...nel.org> wrote:
>>
>> On Wed, Oct 22, 2025 at 01:15:30PM +0200, Pratyush Yadav wrote:
>> > On Tue, Oct 21 2025, Pasha Tatashin wrote:
>> >
>> > > KHO allows clients to preserve memory regions at any point before the
>> > > KHO state is finalized. The finalization process itself involves KHO
>> > > performing its own actions, such as serializing the overall
>> > > preserved memory map.
>> > >
>> > > If this finalization process is aborted, the current implementation
>> > > destroys KHO's internal memory tracking structures
>> > > (`kho_out.ser.track.orders`). This behavior effectively unpreserves
>> > > all memory from KHO's perspective, regardless of whether those
>> > > preservations were made by clients before the finalization attempt
>> > > or by KHO itself during finalization.
>> > >
>> > > This premature unpreservation is incorrect. An abort of the
>> > > finalization process should only undo actions taken by KHO as part of
>> > > that specific finalization attempt. Individual memory regions
>> > > preserved by clients prior to finalization should remain preserved,
>> > > as their lifecycle is managed by the clients themselves. These
>> > > clients might still need to call kho_unpreserve_folio() or
>> > > kho_unpreserve_phys() based on their own logic, even after a KHO
>> > > finalization attempt is aborted.
>> >
>> > I think you also need to update test_kho and reserve_mem to do this
>> > since right now they assume all memory gets unpreserved on failure.
>>
>> I agree.
>
> Hm, this makes no sense to me. So, KHO tried to finalize (i.e.,
> convert xarray to sparse bitmap) and failed (e.g. due to OOM) or
> aborted, so we aborted the finalization. But the preserved memory
> stays preserved, and if user/caller retries finalization and it
> succeeds, the preserved memory will still be passed to the next
> kernel. Why would reserve_mem and test_kho depend on whether KHO
> finalization succeeded or was canceled? It is possible that user
> cancel only to add something else to preservation.

On mainline, the reserve_mem kho_preserve_pages() calls come from the
notifier chain. Any failure on the notifier chain causes an abort and
thus automatically unpreserves all pages that were preserved.

	static int reserve_mem_kho_finalize(struct kho_serialization *ser)
	{
		int err = 0, i;
	
		for (i = 0; i < reserved_mem_count; i++) {
			[...]
			err |= kho_preserve_pages(page, nr_pages);
		}
	
		err |= kho_preserve_folio(page_folio(kho_fdt));
		err |= kho_add_subtree(ser, MEMBLOCK_KHO_FDT, page_to_virt(kho_fdt));
	
		return notifier_from_errno(err);
	}

If any of the kho_preserve_pages() calls fails, the notifier callback
returns an error, which causes an abort, and the abort unpreserves all
memory.

Now that there is no notifier, and thus no abort, the pages must be
unpreserved explicitly before returning.
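
To illustrate the kind of unwinding I mean, here is a minimal userspace
sketch. It is not the actual fix: mock_preserve(), mock_unpreserve(),
preserve_all(), and fail_at are hypothetical stand-ins for the KHO calls,
made up for this example. It only shows the caller undoing its own earlier
preservations when a later one fails.

```c
#include <errno.h>

#define NR_REGIONS 4

/* Hypothetical stand-ins for the KHO calls; this is not the kernel API. */
static int preserved[NR_REGIONS];
static int fail_at = -1;	/* region index at which preserving fails */

static int mock_preserve(int i)
{
	if (i == fail_at)
		return -ENOMEM;
	preserved[i] = 1;
	return 0;
}

static void mock_unpreserve(int i)
{
	preserved[i] = 0;
}

/* Preserve every region or none: undo earlier successes on failure. */
static int preserve_all(void)
{
	int i, err;

	for (i = 0; i < NR_REGIONS; i++) {
		err = mock_preserve(i);
		if (err)
			goto err_unwind;
	}
	return 0;

err_unwind:
	/* Region i failed, so only regions 0..i-1 need to be undone. */
	while (--i >= 0)
		mock_unpreserve(i);
	return err;
}

/* Number of regions currently marked as preserved. */
static int count_preserved(void)
{
	int i, n = 0;

	for (i = 0; i < NR_REGIONS; i++)
		n += preserved[i];
	return n;
}
```

With fail_at set to 2, preserve_all() returns -ENOMEM and no region is
left marked as preserved, which is what the abort used to guarantee.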

Similarly, for test_kho, kho_test_notifier() calls kho_preserve_folio()
and expects the abort to clean things up.

Side note: test_kho also preserves folios from kho_test_save_data() and
doesn't clean them up on error, but that is a separate problem that this
series doesn't have to solve.

I think patch 3/7 is the one that actually causes this problem, since it
gets rid of the notifier. So this is the wrong patch to raise this on; I
mistakenly thought this was the one that triggers it.

-- 
Regards,
Pratyush Yadav
