Message-ID: <djnlhsgjokfx53nvtdhosdfwcoxdl6aaqsmy22ywe22daamsue@uvsbyygxjrhp>
Date: Wed, 22 Oct 2025 11:00:48 +0100
From: Kiryl Shutsemau <kirill@...temov.name>
To: Pedro Falcato <pfalcato@...e.de>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>, 
	Andrew Morton <akpm@...ux-foundation.org>, David Hildenbrand <david@...hat.com>, 
	Matthew Wilcox <willy@...radead.org>, Alexander Viro <viro@...iv.linux.org.uk>, 
	Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>, linux-mm@...ck.org, 
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/filemap: Implement fast short reads

On Wed, Oct 22, 2025 at 08:38:30AM +0100, Pedro Falcato wrote:
> On Tue, Oct 21, 2025 at 09:13:28PM -1000, Linus Torvalds wrote:
> > On Tue, 21 Oct 2025 at 21:08, Pedro Falcato <pfalcato@...e.de> wrote:
> > >
> > > I think we may still have a problematic (rare, possibly theoretical) race here where:
> > >
> > >    T0                      |  T1                                                |  T3
> > > filemap_read_fast_rcu()    |                                                    |
> > >   folio = xas_load(&xas);  |                                                    |
> > >   /* ... */                |  /* truncate or reclaim frees folio, bumps delete  |
> > >                            |     seq */                                         |  folio_alloc() from e.g secretmem
> > >                            |                                                    |  set_direct_map_invalid_noflush(!!)
> > > memcpy_from_file_folio()   |                                                    |
> > >
> > > We may have to use copy_from_kernel_nofault() here? Or is something else stopping this from happening?
> > 
> > Explain how the sequence count doesn't catch this?
> > 
> > We read the sequence count before we do the xas_load(), and we verify
> > it after we've done the memcpy_from_folio.
> > 
> > The whole *point* is that the copy itself is not race-free. That's
> > *why* we do the sequence count.
> > 
> > And only after the sequence count has been verified do we then copy
> > the result to user space.
> > 
> > So the "maybe this buffer content is garbage" happens, but it only
> > happens in the temporary kernel on-stack buffer, not visibly to the
> > user.
> 
> The problem isn't that the contents might be garbage, but that the direct map
> may be swept from under us, as we don't have a reference to the folio. So the
> folio can be transparently freed under us (as designed), but some user can
> call fun stuff like set_direct_map_invalid_noflush() and we're not handling
> any "oopsie we faulted reading the folio" here. The sequence count doesn't
> help here, because we, uhh, faulted. Does this make sense?
> 
> TL;DR I don't think it's safe to touch the direct map of folios we don't own
> without the seatbelt of a copy_from_kernel_nofault or so.

Makes sense. Thanks for catching this!
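
For illustration only, the ordering discussed above can be modelled in
userspace roughly as follows. This is a sketch, not the patch: every
model_* name is hypothetical, the "fault" is simulated with a flag rather
than a real exception fixup, and the race is injected through a callback
instead of a second thread. It assumes the fast path samples the sequence
count, copies into a temporary buffer with a non-faulting copy, re-checks
the count, and only then exposes the bytes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for a page-cache folio. */
struct model_folio {
	unsigned int delete_seq;	/* bumped on truncate or reclaim */
	bool mapped;			/* false models an invalidated direct map */
	char data[64];
};

/*
 * Stand-in for copy_from_kernel_nofault(): report -EFAULT instead of
 * touching memory that is no longer mapped. Here the "fault" is only a
 * flag check.
 */
static int model_copy_nofault(char *dst, const struct model_folio *f, size_t len)
{
	if (!f->mapped)
		return -EFAULT;
	memcpy(dst, f->data, len);
	return 0;
}

/*
 * Returns len on success, 0 if the caller should take the slow path.
 * 'race' optionally models a concurrent free between lookup and copy.
 */
static size_t model_read_fast(struct model_folio *f, char *out, size_t len,
			      void (*race)(struct model_folio *))
{
	char tmp[64];
	unsigned int seq;

	if (len > sizeof(tmp))
		return 0;

	seq = f->delete_seq;			/* 1. sample the sequence count */
	if (race)
		race(f);			/*    concurrent free may land here */
	if (model_copy_nofault(tmp, f, len))	/* 2. copy, tolerating a fault */
		return 0;
	if (f->delete_seq != seq)		/* 3. discard possibly stale bytes */
		return 0;
	memcpy(out, tmp, len);			/* 4. only now expose the data */
	return len;
}

/* Folio freed, but its old mapping is still readable. */
static void model_free_only(struct model_folio *f)
{
	f->delete_seq++;
}

/* Folio freed and reused by something that drops the direct map. */
static void model_free_and_unmap(struct model_folio *f)
{
	f->delete_seq++;
	f->mapped = false;
}

int model_demo(void)
{
	struct model_folio f = { .delete_seq = 0, .mapped = true, .data = "hello" };
	char out[8] = { 0 };

	assert(model_read_fast(&f, out, 6, NULL) == 6);			/* no race */
	assert(model_read_fast(&f, out, 6, model_free_only) == 0);	/* caught by seq check */
	assert(model_read_fast(&f, out, 6, model_free_and_unmap) == 0);	/* caught by nofault copy */
	return 0;
}
```

In the model, the sequence check alone handles the "contents might be
garbage" case, while the unmapped case never reaches that check and has to
be absorbed by the copy itself.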

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
