Message-ID: <CAG48ez0NzMbwnbvMO7KbUROZq5ne7fhiau49v7oyxwPrYL=P6Q@mail.gmail.com>
Date: Tue, 19 Nov 2024 13:52:00 +0100
From: Jann Horn <jannh@...gle.com>
To: Pasha Tatashin <pasha.tatashin@...een.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-doc@...r.kernel.org, linux-fsdevel@...r.kernel.org,
cgroups@...r.kernel.org, linux-kselftest@...r.kernel.org,
akpm@...ux-foundation.org, corbet@....net, derek.kiernan@....com,
dragan.cvetic@....com, arnd@...db.de, gregkh@...uxfoundation.org,
viro@...iv.linux.org.uk, brauner@...nel.org, jack@...e.cz, tj@...nel.org,
hannes@...xchg.org, mhocko@...nel.org, roman.gushchin@...ux.dev,
shakeel.butt@...ux.dev, muchun.song@...ux.dev, Liam.Howlett@...cle.com,
vbabka@...e.cz, shuah@...nel.org, vegard.nossum@...cle.com,
vattunuru@...vell.com, schalla@...vell.com, david@...hat.com,
willy@...radead.org, osalvador@...e.de, usama.anjum@...labora.com,
andrii@...nel.org, ryan.roberts@....com, peterx@...hat.com, oleg@...hat.com,
tandersen@...flix.com, rientjes@...gle.com, gthelen@...gle.com,
linux-hardening@...r.kernel.org,
Kernel Hardening <kernel-hardening@...ts.openwall.com>
Subject: Re: [RFCv1 0/6] Page Detective
On Tue, Nov 19, 2024 at 2:30 AM Pasha Tatashin
<pasha.tatashin@...een.com> wrote:
> > Can you point me to where a refcounted reference to the page comes
> > from when page_detective_metadata() calls dump_page_lvl()?
>
> I am sorry, I remembered incorrectly, we are getting reference right
> after dump_page_lvl() in page_detective_memcg() -> folio_try_get(); I
> will move the folio_try_get() to before dump_page_lvl().
>
> > > > So I think dump_page() in its current form is not something we should
> > > > expose to a userspace-reachable API.
> > >
> > > We use dump_page() all over WARN_ONs in MM code where pages might not
> > > be locked, but this is a good point, that while even the existing
> > > usage might be racy, providing a user-reachable API potentially makes
> > > it worse. I will see if I could add some locking before dump_page(),
> > > or make a dump_page variant that does not do dump_mapping().
> >
> > To be clear, I am not that strongly opposed to racily reading data
> > such that the data may not be internally consistent or such; but this
> > is a case of racy use-after-free reads that might end up dumping
> > entirely unrelated memory contents into dmesg. I think we should
> > properly protect against that in an API that userspace can invoke.
> > Otherwise, if we race, we might end up writing random memory contents
> > into dmesg; and if we are particularly unlucky, those random memory
> > contents could be PII or authentication tokens or such.
> >
> > I'm not entirely sure what the right approach is here; I guess it
> > makes sense that when the kernel internally detects corruption,
> > dump_page doesn't take references on pages it accesses to avoid
> > corrupting things further. If you are looking at a page based on a
> > userspace request, I guess you could access the page with the
> > necessary locking to access its properties under the normal locking
> > rules?
>
> I will take reference, as we already do that for memcg purpose, but
> have not included dump_page().
Note that taking a reference on the page does not make all of
dump_page() safe. In particular, my understanding is that
folio_mapping() requires the page to be locked in order to return a
stable pointer. Some of the code in dump_mapping() would probably also
require other locks, probably at least on the inode and maybe also on
the dentry, I think. Otherwise the inode's dentry list can probably
change concurrently, and the dentry's name pointer can change too.
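
To illustrate the distinction between holding a reference and holding
a lock, here is a minimal userspace sketch. It is not kernel code: the
model_* names are hypothetical, the pthread mutex merely stands in for
the page lock, and the atomic counter stands in for the page refcount.
It only models the first point above (reference for lifetime, lock for
a stable mapping pointer) and says nothing about whatever inode or
dentry locking dump_mapping() might additionally need.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct model_mapping {
	const char *name;
};

struct model_page {
	pthread_mutex_t lock;          /* stands in for the page lock */
	atomic_int refcount;           /* 0 means the object is free */
	struct model_mapping *mapping; /* stable only while lock is held */
};

/* Take a reference unless the refcount has already dropped to zero. */
static bool model_page_try_get(struct model_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old > 0) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference previously taken with model_page_try_get(). */
static void model_page_put(struct model_page *p)
{
	int old = atomic_fetch_sub(&p->refcount, 1);

	assert(old > 0);
}

/*
 * Copy the mapping name into buf. The reference keeps the object
 * alive; the lock keeps p->mapping from changing while it is read.
 * Returns false if the object is free or has no mapping.
 */
static bool model_page_describe(struct model_page *p, char *buf, size_t len)
{
	bool ok = false;

	if (!model_page_try_get(p))
		return false;

	pthread_mutex_lock(&p->lock);
	if (p->mapping) {
		snprintf(buf, len, "%s", p->mapping->name);
		ok = true;
	}
	pthread_mutex_unlock(&p->lock);

	model_page_put(p);
	return ok;
}
```

In this model, a caller that only did the try_get step and then read
p->mapping without the mutex would be safe against the object being
freed, but could still see the pointer change underneath it.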