Message-ID: <e8996592-f7ba-4926-8556-1fe7534038dc@redhat.com>
Date: Sun, 16 Mar 2025 00:00:09 +0100
From: David Hildenbrand <david@...hat.com>
To: John Hubbard <jhubbard@...dia.com>, Christoph Hellwig
 <hch@...radead.org>, Matthew Wilcox <willy@...radead.org>,
 Jason Gunthorpe <jgg@...dia.com>
Cc: Sooyong Suk <s.suk@...sung.com>, viro@...iv.linux.org.uk,
 linux-kernel@...r.kernel.org, akpm@...ux-foundation.org, linux-mm@...ck.org,
 jaewon31.kim@...il.com, spssyr@...il.com, Zi Yan <ziy@...dia.com>
Subject: Re: [RFC PATCH] block, fs: use FOLL_LONGTERM as gup_flags for direct
 IO

On 15.03.25 02:04, John Hubbard wrote:
> On 3/13/25 3:49 PM, David Hildenbrand wrote:
>> On 12.03.25 16:21, Christoph Hellwig wrote:
>>> On Fri, Mar 07, 2025 at 08:23:08PM +0000, Matthew Wilcox wrote:
>>>> However, the problem is real.
>>>
>>> What is the problem?
>>
>> I think the problem is the CMA allocation failure, not the latency.
>>
>> "if a large amount of direct IO is requested constantly, this can make
>> pages in CMA pageblocks pinned and unable to migrate outside of the
>> pageblock"
>>
>> We'd need a more reliable way to make CMA allocation -> page migration
>> make progress. For example, after we isolated the pageblocks and
>> migration starts doing its thing, we could disallow any further GUP
>> pins. (e.g., make GUP spin or wait for migration to end)
>>
>> We could detect in GUP code that a folio is soon expected to be migrated
>> by checking the pageblock (isolated) and/or whether the folio is locked.
>>
> 
> Jason Gunthorpe and Matthew both had some ideas about how to fix this [1],
> which were very close (maybe the same) to what you're saying here: sleep
> and spin in a killable loop.
> 
> It turns out to be a little difficult to do this--I had trouble making
> the folio's "has waiters" bit work for this, for example. And then...squirrel!
> 
> However, I still believe, so far, this is the right approach. I'm just not
> sure which thing to wait on, exactly.

Zi Yan has a series to convert the "isolate" state of pageblocks to a 
separate pageblock bit; it could be considered a lock-bit. Currently, 
it's essentially the migratetype being MIGRATE_ISOLATE.

As soon as a pageblock is isolated, one must be prepared for contained 
pages/folios to get migrated. The folio lock will only be grabbed once 
actually trying to migrate a folio IIRC, so it might not be the best 
choice: especially considering allocations that span many pageblocks.

So maybe one would need a "has waiters" bit per pageblock, so relevant 
users (e.g., GUP) could wait on the isolate bit getting cleared.
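
To make the idea a bit more concrete, here is a rough userspace model of
such a scheme. It is only a sketch under loud assumptions: none of the
names (PB_ISOLATED, PB_HAS_WAITERS, struct pageblock, pb_isolate(),
pb_unisolate(), pb_wait_unisolated()) exist in the kernel, a pthread
mutex/condvar stands in for a kernel waitqueue, and the wait is not
killable. It only shows how a waiters bit lets the un-isolate path skip
the wakeup when nobody is waiting.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-pageblock bits; these names are NOT kernel API. */
#define PB_ISOLATED    (1u << 0)
#define PB_HAS_WAITERS (1u << 1)

struct pageblock {
	atomic_uint flags;
	pthread_mutex_t lock;	/* stand-in for a waitqueue lock */
	pthread_cond_t wq;	/* stand-in for a waitqueue */
};

static void pb_init(struct pageblock *pb)
{
	atomic_init(&pb->flags, 0);
	pthread_mutex_init(&pb->lock, NULL);
	pthread_cond_init(&pb->wq, NULL);
}

/* Migration side: mark the pageblock as isolated. */
static void pb_isolate(struct pageblock *pb)
{
	atomic_fetch_or(&pb->flags, PB_ISOLATED);
}

/*
 * Migration side: clear both bits; only touch the waitqueue if a
 * waiter announced itself. Returns true if a wakeup was issued.
 */
static bool pb_unisolate(struct pageblock *pb)
{
	unsigned int old = atomic_fetch_and(&pb->flags,
					    ~(PB_ISOLATED | PB_HAS_WAITERS));

	if (!(old & PB_HAS_WAITERS))
		return false;
	pthread_mutex_lock(&pb->lock);
	pthread_cond_broadcast(&pb->wq);
	pthread_mutex_unlock(&pb->lock);
	return true;
}

/* "GUP" side: block until the pageblock is no longer isolated. */
static void pb_wait_unisolated(struct pageblock *pb)
{
	if (!(atomic_load(&pb->flags) & PB_ISOLATED))
		return;		/* fast path: nothing to wait for */

	pthread_mutex_lock(&pb->lock);
	for (;;) {
		unsigned int old = atomic_load(&pb->flags);

		if (!(old & PB_ISOLATED))
			break;
		/* Announce ourselves only while the block is still isolated. */
		if (!(old & PB_HAS_WAITERS) &&
		    !atomic_compare_exchange_weak(&pb->flags, &old,
						  old | PB_HAS_WAITERS))
			continue;	/* flags changed under us: re-check */
		pthread_cond_wait(&pb->wq, &pb->lock);
	}
	pthread_mutex_unlock(&pb->lock);
}

struct gup_ctx {
	struct pageblock *pb;
	atomic_bool pinned;
};

static void *gup_thread(void *arg)
{
	struct gup_ctx *ctx = arg;

	pb_wait_unisolated(ctx->pb);
	atomic_store(&ctx->pinned, true);	/* the "pin" happens here */
	return NULL;
}

/* Returns 0 if the pin was held off until the block was un-isolated. */
static int demo(void)
{
	struct pageblock pb;
	struct gup_ctx ctx = { .pb = &pb };
	pthread_t t;

	pb_init(&pb);
	atomic_init(&ctx.pinned, false);
	pb_isolate(&pb);
	if (pthread_create(&t, NULL, gup_thread, &ctx))
		return -1;
	while (!(atomic_load(&pb.flags) & PB_HAS_WAITERS))
		sched_yield();	/* wait for the waiter to register */
	assert(!atomic_load(&ctx.pinned));
	bool woke = pb_unisolate(&pb);
	pthread_join(t, NULL);
	return (woke && atomic_load(&ctx.pinned)) ? 0 : -1;
}
```

The waiter sets the waiters bit with a compare-and-exchange while holding
the lock, so an un-isolate that races with it either makes the exchange
fail (and the waiter re-checks) or has to take the lock and issue the
wakeup after the waiter went to sleep.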

-- 
Cheers,

David / dhildenb