[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZzzeK8Fi_GlXPzjc@casper.infradead.org>
Date: Tue, 19 Nov 2024 18:51:23 +0000
From: Matthew Wilcox <willy@...radead.org>
To: Jann Horn <jannh@...gle.com>
Cc: Pasha Tatashin <pasha.tatashin@...een.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
linux-doc@...r.kernel.org, linux-fsdevel@...r.kernel.org,
cgroups@...r.kernel.org, linux-kselftest@...r.kernel.org,
akpm@...ux-foundation.org, corbet@....net, derek.kiernan@....com,
dragan.cvetic@....com, arnd@...db.de, gregkh@...uxfoundation.org,
viro@...iv.linux.org.uk, brauner@...nel.org, jack@...e.cz,
tj@...nel.org, hannes@...xchg.org, mhocko@...nel.org,
roman.gushchin@...ux.dev, shakeel.butt@...ux.dev,
muchun.song@...ux.dev, Liam.Howlett@...cle.com, vbabka@...e.cz,
shuah@...nel.org, vegard.nossum@...cle.com, vattunuru@...vell.com,
schalla@...vell.com, david@...hat.com, osalvador@...e.de,
usama.anjum@...labora.com, andrii@...nel.org, ryan.roberts@....com,
peterx@...hat.com, oleg@...hat.com, tandersen@...flix.com,
rientjes@...gle.com, gthelen@...gle.com,
linux-hardening@...r.kernel.org,
Kernel Hardening <kernel-hardening@...ts.openwall.com>
Subject: Re: [RFCv1 0/6] Page Detective
On Tue, Nov 19, 2024 at 01:52:00PM +0100, Jann Horn wrote:
> > I will take reference, as we already do that for memcg purpose, but
> > have not included dump_page().
>
> Note that taking a reference on the page does not make all of
> dump_page() fine; in particular, my understanding is that
> folio_mapping() requires that the page is locked in order to return a
> stable pointer, and some of the code in dump_mapping() would probably
> also require some other locks - probably at least on the inode and
> maybe also on the dentry, I think? Otherwise the inode's dentry list
> can probably change concurrently, and the dentry's name pointer can
> change too.
First important thing is that we snapshot the page. So while we may
have a torn snapshot of the page, it can't change under us any more,
so we don't have to worry about it being swizzled one way and then
swizzled back.
Second thing is that I think using folio_mapping() is actually wrong.
We don't want the swap mapping if it's an anon page that's in the
swapcache. We'd be fine just doing mapping = folio->mapping (we'd need
to add a check for movable, but I think that's fine). Anyway, we know
the folio isn't ksm or anon at the point that we call dump_mapping()
because there's a chain of "else" statements. So I think we're fine
because we can't switch between anon & file while holding a refcount.
Having a refcount on the folio will prevent the folio from being allocated
to anything else again. It will not protect the mapping from being torn
down (the folio can be truncated from the mapping, then the mapping can
be freed, and the memory reused). As you say, the dentry can be renamed
as well.
This patch series makes me nervous. I'd rather see it done as a bpf
script or drgn script, but if it is going to be done in C, I'd really
like to see more auditing of the safety here. It feels like the kind
of hack that one deploys internally to debug a hard-to-hit condition,
rather than the kind of code that we like to ship upstream.
Powered by blists - more mailing lists