Message-ID: <20200930103507.GK2142832@kernel.org>
Date: Wed, 30 Sep 2020 13:35:07 +0300
From: Mike Rapoport <rppt@...nel.org>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
Cc: "mark.rutland@....com" <mark.rutland@....com>,
"david@...hat.com" <david@...hat.com>,
"cl@...ux.com" <cl@...ux.com>, "hpa@...or.com" <hpa@...or.com>,
"peterz@...radead.org" <peterz@...radead.org>,
"catalin.marinas@....com" <catalin.marinas@....com>,
"linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"will@...nel.org" <will@...nel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"idan.yaniv@....com" <idan.yaniv@....com>,
"kirill@...temov.name" <kirill@...temov.name>,
"viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
"rppt@...ux.ibm.com" <rppt@...ux.ibm.com>,
"Williams, Dan J" <dan.j.williams@...el.com>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
"bp@...en8.de" <bp@...en8.de>,
"willy@...radead.org" <willy@...radead.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"luto@...nel.org" <luto@...nel.org>,
"shuah@...nel.org" <shuah@...nel.org>,
"arnd@...db.de" <arnd@...db.de>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
"x86@...nel.org" <x86@...nel.org>,
"linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"Reshetova, Elena" <elena.reshetova@...el.com>,
"palmer@...belt.com" <palmer@...belt.com>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"mtk.manpages@...il.com" <mtk.manpages@...il.com>,
"tycho@...ho.ws" <tycho@...ho.ws>,
"linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"paul.walmsley@...ive.com" <paul.walmsley@...ive.com>,
"jejb@...ux.ibm.com" <jejb@...ux.ibm.com>
Subject: Re: [PATCH v6 3/6] mm: introduce memfd_secret system call to create
"secret" memory areas
On Tue, Sep 29, 2020 at 08:06:03PM +0000, Edgecombe, Rick P wrote:
> On Tue, 2020-09-29 at 16:06 +0300, Mike Rapoport wrote:
> > On Tue, Sep 29, 2020 at 04:58:44AM +0000, Edgecombe, Rick P wrote:
> > > On Thu, 2020-09-24 at 16:29 +0300, Mike Rapoport wrote:
> > > > Introduce the "memfd_secret" system call with the ability to create
> > > > memory areas visible only in the context of the owning process and
> > > > mapped neither into other processes nor into the kernel page tables.
> > > >
> > > > The user will create a file descriptor using the memfd_secret()
> > > > system call, where flags supplied as a parameter to this system call
> > > > will define the desired protection mode for the memory associated
> > > > with that file descriptor.
> > > >
> > > > Currently there are two protection modes:
> > > >
> > > > * exclusive - the memory area is unmapped from the kernel direct map
> > > >   and it is present only in the page tables of the owning mm.
> > >
> > > Seems like there were some concerns raised around direct map
> > > efficiency, but in case you are going to rework this...how does this
> > > memory work for the existing kernel functionality that does things
> > > like this?
> > >
> > > get_user_pages(, &page);
> > > ptr = kmap(page);
> > > foo = *ptr;
> > >
> > > Not sure if I'm missing something, but I think apps could cause the
> > > kernel to access a not-present page and oops.
> >
> > The idea is that this memory should not be accessible by the kernel,
> > so the sequence you describe should indeed fail.
> >
> > Probably an oops would be too noisy, and in this case the report needs
> > to be less verbose.
>
> I was more concerned that it could cause kernel instabilities.

I think the kernel recovers nicely from this sort of page fault, at least
on x86.

> I see, so it should not be accessed even at the userspace address? I
> wonder if it should be prevented somehow then. At least
> get_user_pages() should be prevented I think. Blocking copy_*_user()
> access might not be simple.
>
> I'm also not so sure that a user would never have any possible reason
> to copy data from this memory into the kernel, even if it's just
> convenience. In which case a user setup could break if a specific
> kernel implementation switched to get_user_pages()/kmap() from using
> copy_*_user(). So seems maybe a bit thorny without fully blocking
> access from the kernel, or deprecating that pattern.
>
> You should probably call out these "no passing data to/from the kernel"
> expectations, unless I missed them somewhere.

You are right, I should have been more explicit in the description of
the expected behavior.

Our thinking was that copy_*_user() would work in the context of the
process that "owns" the secretmem and gup() would not allow access in
general, unless requested with a certain (yet another) FOLL_ flag.

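For illustration only, the gup() side of that rule could be modelled in
plain userspace C roughly as below. Everything in it is hypothetical:
FOLL_SECRETMEM_SKETCH, struct sketch_page and sketch_gup_check() are names
made up for this sketch and are not part of the patch set or of the kernel.
It only shows the intended decision (refuse secretmem pages unless the
caller opts in), not how gup() would actually implement it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: the thread only says "yet another FOLL_ flag". */
#define FOLL_SECRETMEM_SKETCH (1u << 0)

/* Hypothetical stand-in for whatever per-page state gup() would consult. */
struct sketch_page {
	bool is_secretmem;
};

/*
 * Model of the rule described above: a gup()-style lookup refuses
 * secretmem pages unless the caller explicitly opted in with the
 * (hypothetical) flag.
 *
 * Returns 0 if the access would be allowed, -EFAULT otherwise.
 */
static int sketch_gup_check(const struct sketch_page *page,
			    unsigned int gup_flags)
{
	assert(page != NULL);

	if (page->is_secretmem && !(gup_flags & FOLL_SECRETMEM_SKETCH))
		return -EFAULT;

	return 0;
}
```

With this model, a caller passing no flags gets -EFAULT for a secretmem
page and 0 for an ordinary page, while a caller passing the opt-in flag
gets 0 in both cases. The copy_*_user() path is deliberately not modelled,
since it goes through the owner's userspace mapping rather than through
gup().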
--
Sincerely yours,
Mike.