Message-ID: <8eb933f921c9dfe4c9b1b304e8f8fa4fbc249d84.camel@linux.ibm.com>
Date: Thu, 06 May 2021 10:05:27 -0700
From: James Bottomley <jejb@...ux.ibm.com>
To: David Hildenbrand <david@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Mike Rapoport <rppt@...nel.org>
Cc: Alexander Viro <viro@...iv.linux.org.uk>,
Andy Lutomirski <luto@...nel.org>,
Arnd Bergmann <arnd@...db.de>, Borislav Petkov <bp@...en8.de>,
Catalin Marinas <catalin.marinas@....com>,
Christopher Lameter <cl@...ux.com>,
Dan Williams <dan.j.williams@...el.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Elena Reshetova <elena.reshetova@...el.com>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
"Kirill A. Shutemov" <kirill@...temov.name>,
Matthew Wilcox <willy@...radead.org>,
Matthew Garrett <mjg59@...f.ucam.org>,
Mark Rutland <mark.rutland@....com>,
Michal Hocko <mhocko@...e.com>,
Mike Rapoport <rppt@...ux.ibm.com>,
Michael Kerrisk <mtk.manpages@...il.com>,
Palmer Dabbelt <palmer@...belt.com>,
Paul Walmsley <paul.walmsley@...ive.com>,
Peter Zijlstra <peterz@...radead.org>,
"Rafael J. Wysocki" <rjw@...ysocki.net>,
Rick Edgecombe <rick.p.edgecombe@...el.com>,
Roman Gushchin <guro@...com>,
Shakeel Butt <shakeelb@...gle.com>,
Shuah Khan <shuah@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Tycho Andersen <tycho@...ho.ws>, Will Deacon <will@...nel.org>,
linux-api@...r.kernel.org, linux-arch@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org,
linux-nvdimm@...ts.01.org, linux-riscv@...ts.infradead.org,
x86@...nel.org
Subject: Re: [PATCH v18 0/9] mm: introduce memfd_secret system call to
create "secret" memory areas
On Thu, 2021-05-06 at 18:45 +0200, David Hildenbrand wrote:
> On 06.05.21 17:26, James Bottomley wrote:
> > On Wed, 2021-05-05 at 12:08 -0700, Andrew Morton wrote:
> > > On Wed, 3 Mar 2021 18:22:00 +0200 Mike Rapoport <rppt@...nel.org>
> > > wrote:
> > >
> > > > This is an implementation of "secret" mappings backed by a file
> > > > descriptor.
> > > >
> > > > The file descriptor backing secret memory mappings is created
> > > > > using a dedicated memfd_secret system call. The desired
> > > > protection mode for the memory is configured using flags
> > > > parameter of the system call. The mmap() of the file descriptor
> > > > created with memfd_secret() will create a "secret" memory
> > > > mapping. The pages in that mapping will be marked as not
> > > > present in the direct map and will be present only in the page
> > > > table of the owning mm.
> > > >
> > > > Although normally Linux userspace mappings are protected from
> > > > other users, such secret mappings are useful for environments
> > > > where a hostile tenant is trying to trick the kernel into
> > > > > giving them access to other tenants' mappings.
> > >
> > > I continue to struggle with this and I don't recall seeing much
> > > enthusiasm from others. Perhaps we're all missing the value
> > > point and some additional selling is needed.
> > >
> > > Am I correct in understanding that the overall direction here is
> > > to protect keys (and perhaps other things) from kernel
> > > bugs? That if the kernel was bug-free then there would be no
> > > need for this feature? If so, that's a bit sad. But realistic I
> > > guess.
> >
> > Secret memory really serves several purposes. One is the "increase
> > the level of difficulty of secret exfiltration" you describe and, as
> > you say, if the kernel were bug free this wouldn't be necessary.
> >
> > But also:
> >
> > 1. Memory safety for user space code. Once the secret memory is
> > allocated, the user can't accidentally pass it into the
> > kernel to be transmitted somewhere.
>
> That's an interesting point I didn't realize so far.
>
> > 2. It also serves as a basis for context protection of virtual
> > machines, but other groups are working on this aspect, and
> > it is broadly similar to the secret exfiltration from the
> > kernel problem.
> >
>
> I was wondering if this also helps against CPU microcode issues like
> spectre and friends.
It can for VMs, but not really for the user space secret memory use
cases ... the in-kernel mitigations already present are much more
effective.
>
> > > Is this intended to protect keys/etc after the attacker has
> > > gained the ability to run arbitrary kernel-mode code? If so,
> > > that seems optimistic, doesn't it?
> >
> > Not exactly: there are many types of kernel attack, but mostly the
> > attacker either manages to effect a privilege escalation to root or
> > gets the ability to run a ROP gadget. The object of this code is
> > to be completely secure against root trying to extract the secret
> > (somewhat similar to the lockdown idea), thus defeating privilege
> > escalation and to provide "sufficient" protection against ROP
> > gadget.
>
> What stops "root" from mapping /dev/mem and reading that memory?
/dev/mem uses the direct map for the copy at least for read/write, so
it gets a fault in the same way root trying to use ptrace does. I
think we've protected mmap, but Mike would know that better than I.
> IOW, would we want to enforce "CONFIG_STRICT_DEVMEM" with
> CONFIG_SECRETMEM?
Unless there's a corner case I haven't thought of, I don't think it
adds much. However, doing a full lockdown on a public system where
users want to use secret memory is best practice I think (except I
think you want it to be the full secure boot lockdown to close all the
root holes).
> Also, there is a way to still read that memory when root by
>
> 1. Having kdump active (which would often be the case, but maybe not
> to dump user pages)
> 2. Triggering a kernel crash (easy via proc as root)
> 3. Waiting for the reboot after kdump created the dump and then
> reading the content from disk.
Anything that can leave physical memory intact but boot to a kernel
where the missing direct map entry is restored could theoretically
extract the secret. However, it's not exactly going to be a stealthy
extraction ...
> Or, as an attacker, load a custom kexec() kernel and read memory
> from the new environment. Of course, the latter two are advanced
> mechanisms, but they are possible when root. We might be able to
> mitigate, for example, by zeroing out secretmem pages before booting
> into the kexec kernel, if we care :)
I think we could handle it by marking the region, yes, and a zero on
shutdown might be useful ... it would prevent all warm reboot type
attacks.
James
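For readers following the thread, the user-space flow described in the
quoted cover letter (create the descriptor, size it, mmap() it) can be
sketched roughly as below. This is a minimal illustration and not code
from the patch series: the helper names memfd_secret_fd() and
secret_roundtrip() are hypothetical, the fallback syscall number 447 is
an assumption for headers that do not yet define __NR_memfd_secret, and
it assumes a kernel with secret memory built in and enabled.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: 447 is the number used by x86-64 and the generic syscall
 * table; only needed when the installed headers lack the definition. */
#ifndef __NR_memfd_secret
#define __NR_memfd_secret 447
#endif

/* Hypothetical helper: thin wrapper around the raw system call. */
static int memfd_secret_fd(unsigned int flags)
{
	return (int)syscall(__NR_memfd_secret, flags);
}

/*
 * Hypothetical helper: copy `len` bytes of `key` into a secret mapping
 * and read them back.
 * Returns 0 if the round trip worked, 1 if secret memory could not be
 * set up on this system, -1 if the data read back did not match.
 */
int secret_roundtrip(const char *key, size_t len)
{
	assert(key != NULL && len > 0);

	long page = sysconf(_SC_PAGESIZE);
	if (page <= 0)
		return 1;
	/* Round the mapping size up to a whole number of pages. */
	size_t size = ((len + (size_t)page - 1) / (size_t)page) * (size_t)page;

	int fd = memfd_secret_fd(0);
	if (fd < 0) {
		perror("memfd_secret");	/* e.g. ENOSYS when unsupported */
		return 1;
	}

	/* The descriptor starts empty; give it a size before mapping. */
	if (ftruncate(fd, (off_t)size) < 0) {
		perror("ftruncate");
		close(fd);
		return 1;
	}

	/* The mapping must be shared; its pages leave the direct map. */
	char *secret = mmap(NULL, size, PROT_READ | PROT_WRITE,
			    MAP_SHARED, fd, 0);
	if (secret == MAP_FAILED) {
		perror("mmap");
		close(fd);
		return 1;
	}
	close(fd);	/* the mapping stays valid after close() */

	memcpy(secret, key, len);
	int rc = memcmp(secret, key, len) == 0 ? 0 : -1;

	munmap(secret, size);
	return rc;
}
```

A caller might use secret_roundtrip("example-key", 11) as a quick probe:
a return of 1 means the running kernel did not provide secret memory
(or the mapping was refused, for instance by the locked-memory limit),
so the program should fall back to ordinary memory.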