lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <67f98e27a799e_7205294e3@dwillia2-xfh.jf.intel.com.notmuch>
Date: Fri, 11 Apr 2025 14:48:23 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Dave Hansen <dave.hansen@...el.com>, Dan Williams
	<dan.j.williams@...el.com>, Kees Cook <kees@...nel.org>
CC: <dave.hansen@...ux.intel.com>, Nikolay Borisov <nik.borisov@...e.com>,
	<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 2/3] devmem: Block mmap access when read/write access
 is restricted

Dave Hansen wrote:
> On 4/10/25 21:59, Dan Williams wrote:
> >> I don't think we can just fail the mmap. 🙁
> > For the TVM case the havoc of failing mmap for DMI info is smaller and
> > the recommended fallback for /dev/mem being in accessible is
> > /sys/firmware/dmi/tables. So I feel ok making TVMs take the modern
> > replacement path which is what they would need to do anyway in the
> > lockdown_kernel case. Tom, Dave, what do you think?
> 
> Yeah, doing the same as lockdown should be fine.

Note that lockdown fails the open() for /dev/mem and fails the mmap()
for PCI sysfs. So the proposal here would arrange for lockdown to not be
required, but fail mmap() in both cases.

> The other alternative to failing all mmap()s is to allow mmap(PROT_READ)
> to succeed and then just map the zero page.

The only goal of mapping zeroes I can see is to attempt to break legacy
userspace less severely, but as the Debian code search shows, legacy
/dev/mem users have already found the mmap() loophole. So, zeroes for
the TVM case does not help satisfy the requirement to use
/sys/firmware/dmi/tables and other modern methods with
private-memory-safe semantics. I.e. the "success but zero" and "mmap()
failed" cases have the same outcome: legacy software falls back to no
DMI info.

For PCI sysfs resource mmap() the semantics are different. mmap() fails
only when devmem_is_allowed() says "no" *and* the kernel has marked the
range as IORESOURCE_BUSY in the iomem resource tree. That should be
sufficient to allow userpsace PCI drivers in TVMs because the goal here
is to avoid simultaneous mappings with mismatched encryption settings or
allowing userspace access to unaccepted private memory.

As long as the upcoming TDISP code is careful to hold a
request_resource() reservation over attempts to convert PCI MMIO from
shared to private, it should close any potential for mismatched
encryption settings, or access to unaccepted private MMIO.

However to get that behavior of not allowing simultaneous
kernel-ioremap() plus userspace mmap() of a PCI resource the kernel
needs to be built with CONFIG_IO_STRICT_DEVMEM=y. I note, for example
that RHEL does not set that, but Fedora does.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ