[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <67f98e27a799e_7205294e3@dwillia2-xfh.jf.intel.com.notmuch>
Date: Fri, 11 Apr 2025 14:48:23 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Dave Hansen <dave.hansen@...el.com>, Dan Williams
<dan.j.williams@...el.com>, Kees Cook <kees@...nel.org>
CC: <dave.hansen@...ux.intel.com>, Nikolay Borisov <nik.borisov@...e.com>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 2/3] devmem: Block mmap access when read/write access
is restricted
Dave Hansen wrote:
> On 4/10/25 21:59, Dan Williams wrote:
> >> I don't think we can just fail the mmap. 🙁
> > For the TVM case the havoc of failing mmap for DMI info is smaller and
> > the recommended fallback for /dev/mem being in accessible is
> > /sys/firmware/dmi/tables. So I feel ok making TVMs take the modern
> > replacement path which is what they would need to do anyway in the
> > lockdown_kernel case. Tom, Dave, what do you think?
>
> Yeah, doing the same as lockdown should be fine.
Note that lockdown fails the open() for /dev/mem and fails the mmap()
for PCI sysfs. So the proposal here would arrange for lockdown to not be
required, but fail mmap() in both cases.
> The other alternative to failing all mmap()s is to allow mmap(PROT_READ)
> to succeed and then just map the zero page.
The only goal of mapping zeroes I can see is to attempt to break legacy
userspace less severely, but as the Debian code search shows, legacy
/dev/mem users have already found the mmap() loophole. So, zeroes for
the TVM case does not help satisfy the requirement to use
/sys/firmware/dmi/tables and other modern methods with
private-memory-safe semantics. I.e. the "success but zero" and "mmap()
failed" cases have the same outcome: legacy software falls back to no
DMI info.
For PCI sysfs resource mmap() the semantics are different. mmap() fails
only when devmem_is_allowed() says "no" *and* the kernel has marked the
range as IORESOURCE_BUSY in the iomem resource tree. That should be
sufficient to allow userpsace PCI drivers in TVMs because the goal here
is to avoid simultaneous mappings with mismatched encryption settings or
allowing userspace access to unaccepted private memory.
As long as the upcoming TDISP code is careful to hold a
request_resource() reservation over attempts to convert PCI MMIO from
shared to private, it should close any potential for mismatched
encryption settings, or access to unaccepted private MMIO.
However to get that behavior of not allowing simultaneous
kernel-ioremap() plus userspace mmap() of a PCI resource the kernel
needs to be built with CONFIG_IO_STRICT_DEVMEM=y. I note, for example
that RHEL does not set that, but Fedora does.
Powered by blists - more mailing lists