[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cfb5afaa-8636-4c7d-a1a2-2e0a85f9f3d3@lucifer.local>
Date: Mon, 24 Apr 2023 19:22:03 +0100
From: Lorenzo Stoakes <lstoakes@...il.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Christoph Hellwig <hch@...radead.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Jens Axboe <axboe@...nel.dk>,
Matthew Wilcox <willy@...radead.org>,
Dennis Dalessandro <dennis.dalessandro@...nelisnetworks.com>,
Leon Romanovsky <leon@...nel.org>,
Christian Benvenuti <benve@...co.com>,
Nelson Escobar <neescoba@...co.com>,
Bernard Metzler <bmt@...ich.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Bjorn Topel <bjorn@...nel.org>,
Magnus Karlsson <magnus.karlsson@...el.com>,
Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
Jonathan Lemon <jonathan.lemon@...il.com>,
"David S . Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Christian Brauner <brauner@...nel.org>,
Richard Cochran <richardcochran@...il.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Jesper Dangaard Brouer <hawk@...nel.org>,
John Fastabend <john.fastabend@...il.com>,
linux-fsdevel@...r.kernel.org, linux-perf-users@...r.kernel.org,
netdev@...r.kernel.org, bpf@...r.kernel.org,
Oleg Nesterov <oleg@...hat.com>
Subject: Re: [PATCH v2] mm/gup: disallow GUP writing to file-backed mappings
by default
On Mon, Apr 24, 2023 at 02:36:47PM -0300, Jason Gunthorpe wrote:
> On Mon, Apr 24, 2023 at 03:29:57PM +0100, Lorenzo Stoakes wrote:
> > On Mon, Apr 24, 2023 at 10:39:25AM -0300, Jason Gunthorpe wrote:
> > > On Mon, Apr 24, 2023 at 01:38:49PM +0100, Lorenzo Stoakes wrote:
> > >
> > > > I was being fairly conservative in that list, though we certainly need to
> > > > set the flag for /proc/$pid/mem and ptrace to avoid breaking this
> > > > functionality (I observed breakpoints breaking without it which obviously
> > > > is a no go :). I'm not sure if there's a more general way we could check
> > > > for this though?
> > >
> > > More broadly we should make sure these usages of GUP safe somehow so
> > > that it can reliably write to those types of pages without breaking
> > > the current FS contract..
> > >
> > > I forget exactly, but IIRC, don't you have to hold some kind of page
> > > spinlock while writing to the page memory?
> > >
> >
> > I think perhaps you're thinking of the mm->mmap_lock? Which will be held
> > for the FOLL_GET cases and simply prevent the VMA from disappearing below
> > us but not do much else.
>
> No not mmap_lock, I want to say there is a per-page lock that
> interacts with the write protect, or at worst this needs to use the
> page table spinlocks.
>
> > I wonder whether we should do this check purely for FOLL_PIN to be honest?
> > As this indicates medium to long-term access without mmap_lock held. This
> > would exclude the /proc/$pid/mem and ptrace paths which use gup_remote().
>
> Everything is buggy. FOLL_PIN is part of a someday solution to solve
> it.
>
> > That and a very specific use of uprobes are the only places that use
> > FOLL_GET in this instance and each of them are careful in any case to
> > handle setting the dirty page flag.
>
> That is actually the bug :) Broadly the bug is to make a page dirty
> without holding the right locks to actually dirty it.
>
> Jason
OK I guess you mean the folio lock :) Well there is
unpin_user_pages_dirty_lock() and unpin_user_page_range_dirty_lock() and
also set_page_dirty_lock() (used by __access_remote_vm()) which should
avoid this.
Also __access_remote_vm() which all the ptrace and /proc/$pid/mem use does
set_page_dirty_lock() and only after the user actually writes to it (and
with FOLL_FORCE of course).
None of these are correctly telling a write notify filesystem about the
change, however.
We definitely need to keep ptrace and /proc/$pid/mem functioning correctly,
and I given the privilege levels required I don't think there's a security
issue there?
I do think the mkwrite/write notify file system check is the correct one as
these are the only ones to whom the page being dirty would matter to.
So perhaps we can move forward with:-
- Use mkwrite check rather than shmem/hugetlb.
- ALWAYS enforce not write notify file system if FOLL_LONGTERM (that
removes a lot of the changes here).
- If FOLL_FORCE, then allow this to override the check. This is required
for /proc/$pid/mem and ptrace and is a privileged operation anyway, so
can not cause a security issue.
- Add a FOLL_WILL_UNPIN_DIRTY flag to indicate that the caller will
actually do so (required for process_vm_access cases).
Alternatively we could implement something very cautious and opt-in, like a
FOLL_CHECK_SANITY flag? (starting to feel like I need one of those myself :)
Powered by blists - more mailing lists