Message-ID: <42ce8dcc-0139-4dd7-9bef-bf3efa93849a@redhat.com>
Date: Tue, 27 May 2025 13:06:42 +0200
From: David Hildenbrand <david@...hat.com>
To: Barry Song <21cnbao@...il.com>
Cc: aarcange@...hat.com, akpm@...ux-foundation.org,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org, lokeshgidra@...gle.com,
 peterx@...hat.com, ryncsn@...il.com, surenb@...gle.com
Subject: Re: [BUG]userfaultfd_move fails to move a folio when swap-in occurs
 concurrently with swap-out

>>
>>          EBUSY
>>                 The pages in the source virtual memory range are either
>>                 pinned or not exclusive to the process. The kernel might
>>                 only perform lightweight checks for detecting whether the
>>                 pages are exclusive. To make the operation more likely to
>>                 succeed, KSM should be disabled, fork() should be avoided
>>                 or MADV_DONTFORK should be configured for the source
>>                 virtual memory area before fork().
>>
>> Note the "lightweight" and "more likely to succeed".
>>
> 
> Initially, my point was that an exclusive folio (single-process case)
> should be movable.

Yeah, I wish we didn't need that PAE (PageAnonExclusive) hack in the swapin code.

I was asking myself if we could just ... wait for writeback to end in 
that case?

I mean, if we had to swap in the folio, we would also have to wait 
for disk I/O ... so here we would have to wait for disk I/O as well.

We could either wait for writeback before mapping the folio, or set the 
PAE bit and map the page R/O, to then wait for writeback during write 
faults.

The latter has the downside that we have to handle it with more 
complexity during write faults (check if page is under writeback, then 
check if we require this sync I/O during write faults even though PAE is 
set ...).
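
To make the two options a bit more concrete, here is a rough userspace toy 
model. It is not kernel code: struct toy_folio and all toy_* helpers are 
made up purely for illustration, and "waiting" is modeled by simply 
clearing a flag and bumping a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a folio; NOT a kernel structure. */
struct toy_folio {
	bool writeback;      /* swap-out I/O still in flight */
	bool anon_exclusive; /* models the PAE bit */
	bool mapped;
	bool writable;
	int waits;           /* how often we blocked on writeback */
};

/* Models blocking until the swap-out I/O has completed. */
static void toy_wait_writeback(struct toy_folio *f)
{
	if (f->writeback) {
		f->writeback = false;
		f->waits++;
	}
}

/*
 * Option 1: wait for writeback before mapping the folio.
 * Assumption of this sketch: once writeback is done, the folio can be
 * marked exclusive and mapped writable right away.
 */
static void toy_swapin_wait_first(struct toy_folio *f)
{
	toy_wait_writeback(f);
	f->anon_exclusive = true;
	f->mapped = true;
	f->writable = true;
}

/* Option 2: set PAE and map R/O; the wait is deferred to a write fault. */
static void toy_swapin_map_ro(struct toy_folio *f)
{
	f->anon_exclusive = true;
	f->mapped = true;
	f->writable = false;
}

/* Write fault for option 2: the extra check that adds complexity. */
static void toy_write_fault(struct toy_folio *f)
{
	assert(f->mapped);
	/* Even though PAE is set, we still have to look at writeback. */
	if (f->anon_exclusive && f->writeback)
		toy_wait_writeback(f);
	f->writable = true;
}
```

In this model, both options end up waiting for the I/O exactly once; the 
difference is only where the wait happens and that option 2 needs the 
additional writeback check in the write-fault path.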

> Now I understand this isn’t a bug, but rather a compromise made due
> to implementation constraints.

That is a good summary!

> Perhaps the remaining value of this report is that it helped better
> understand scenarios beyond fork where a move might also fail.
> 
> I truly appreciate your time and your clear analysis.

YW :)

-- 
Cheers,

David / dhildenb

