lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7bfd0822-5687-4ddc-9637-0cedd404c34e@redhat.com>
Date: Mon, 27 Oct 2025 17:06:34 +0100
From: David Hildenbrand <david@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>,
 Hugh Dickins <hughd@...gle.com>
Cc: Kiryl Shutsemau <kirill@...temov.name>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Matthew Wilcox <willy@...radead.org>,
 Alexander Viro <viro@...iv.linux.org.uk>,
 Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
 Yang Shi <shy828301@...il.com>, Dave Chinner <david@...morbit.com>,
 Suren Baghdasaryan <surenb@...gle.com>, linux-mm@...ck.org,
 linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/filemap: Implement fast short reads

On 27.10.25 16:50, Linus Torvalds wrote:
> On Mon, 27 Oct 2025 at 03:49, Hugh Dickins <hughd@...gle.com> wrote:
>>
>> This makes a fundamental change to speculative page cache assumptions.
> 
> Yes, but I'm a bit surprised by people who find that scary.
> 
> The page cache does *much* more scary things elsewhere, particularly
> the whole folio_try_get() dance (in filemap_get_entry() and other
> places).
> 
> I suspect you ignore that just because it's been that way forever, so
> you're comfortable with it.
> 
> I'd argue that that is much *much* more subtle because it means that
> somebody may be incrementing the page count of a page that has already
> been re-allocated by somebody else.
> 
> Talk about cognitive load: that code makes you think that "hey, the
> tryget means that if it has been released, we don't get a ref to it",
> because that's how many of our *other* speculative RCU accesses do in
> fact work.
> 
> But that's not how the page cache works, exactly because freeing isn't
> actually RCU-delayed.
> 
> So while the code visually follows the exact same pattern as some
> other "look up speculatively under RCU, skip if it's not there any
> more", it actually does exactly the same thing as the "copy data under
> RCU, then check later if it was ok". Except it does "increment
> refcount under RCU, then check later if it was actually valid".
> 
> That said, I wonder if we might not consider making page cache freeing
> be RCU-delayed. This has come up before (exactly *because* of that
> "folio_try_get()").
> 
> Because while I am pretty sure that filemap_get_entry() is fine (and a
> number of other core users), I'm not convinced that some of the other
> users of folio_try_get() are necessarily aware of just how subtle that
> thing is.
> 
> Anyway, I'm certainly not going to push that patch very hard.

I will sleep better at night if we can guarantee that we are reading 
from a folio that has not been reused in the meantime -- or reading 
random other memory as I raised in my other mail.

So I really wish that we can defer optimizing this to freeing folios 
under RCU instead.

-- 
Cheers

David / dhildenb


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ