[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <68094f51a7b01_71fe294c6@dwillia2-xfh.jf.intel.com.notmuch>
Date: Wed, 23 Apr 2025 13:36:33 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Naveen N Rao <naveen@...nel.org>, Dan Williams <dan.j.williams@...el.com>
CC: <dave.hansen@...ux.intel.com>, <x86@...nel.org>, Kees Cook
<kees@...nel.org>, Ingo Molnar <mingo@...nel.org>, Vishal Annapurve
<vannapurve@...gle.com>, Kirill Shutemov <kirill.shutemov@...ux.intel.com>,
Nikolay Borisov <nik.borisov@...e.com>, <stable@...r.kernel.org>,
<linux-coco@...ts.linux.dev>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 2/2] x86/devmem: Drop /dev/mem access for confidential
guests
Naveen N Rao wrote:
> On Fri, Apr 18, 2025 at 01:04:02PM -0700, Dan Williams wrote:
> > Nikolay reports [1] that accessing BIOS data (first 1MB of the physical
> > address space) via /dev/mem results in an SEPT violation.
> >
> > The cause is ioremap() (via xlate_dev_mem_ptr()) establishes an
> > unencrypted mapping where the kernel had established an encrypted
> > mapping previously.
> >
> > Linux traps read(2) access to the BIOS data area, and returns zero.
> > However, it turns out the kernel fails to enforce the same via mmap(2).
> > This is a hole, and unfortunately userspace has learned to exploit it
> > [2].
> >
> > This means the kernel either needs a mechanism to ensure consistent
> > "encrypted" mappings of this /dev/mem mmap() hole, close the hole by
> > mapping the zero page in the mmap(2) path, block only BIOS data access
> > and let typical STRICT_DEVMEM protect the rest, or disable /dev/mem
> > altogether.
> >
> > The simplest option for now is arrange for /dev/mem to always behave as
> > if lockdown is enabled for confidential guests. Require confidential
> > guest userspace to jettison legacy dependencies on /dev/mem similar to
> > how other legacy mechanisms are jettisoned for confidential operation.
> > Recall that modern methods for BIOS data access are available like
> > /sys/firmware/dmi/tables.
> >
> > Cc: <x86@...nel.org>
> > Cc: Kees Cook <kees@...nel.org>
> > Cc: Ingo Molnar <mingo@...nel.org>
> > Cc: "Naveen N Rao" <naveen@...nel.org>
> > Cc: Vishal Annapurve <vannapurve@...gle.com>
> > Cc: Kirill Shutemov <kirill.shutemov@...ux.intel.com>
> > Link: https://sources.debian.org/src/libdebian-installer/0.125/src/system/subarch-x86-linux.c/?hl=113#L93 [2]
> > Reported-by: Nikolay Borisov <nik.borisov@...e.com>
> > Closes: http://lore.kernel.org/20250318113604.297726-1-nik.borisov@suse.com [1]
> > Fixes: 9aa6ea69852c ("x86/tdx: Make pages shared in ioremap()")
> > Cc: <stable@...r.kernel.org>
> > Acked-by: Dave Hansen <dave.hansen@...ux.intel.com>
> > Signed-off-by: Dan Williams <dan.j.williams@...el.com>
> > ---
> > Changes since v3
> > * Fix a 0day kbuild robot report about missing cc_platform.h include.
> >
> > arch/x86/Kconfig | 4 ++++
> > drivers/char/mem.c | 10 ++++++++++
> > 2 files changed, 14 insertions(+)
> >
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index 4b9f378e05f6..bf4528d9fd0a 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -891,6 +891,8 @@ config INTEL_TDX_GUEST
> > depends on X86_X2APIC
> > depends on EFI_STUB
> > depends on PARAVIRT
> > + depends on STRICT_DEVMEM
> > + depends on IO_STRICT_DEVMEM
> > select ARCH_HAS_CC_PLATFORM
> > select X86_MEM_ENCRYPT
> > select X86_MCE
> > @@ -1510,6 +1512,8 @@ config AMD_MEM_ENCRYPT
>
> As far as I know, AMD_MEM_ENCRYPT is for the host SME support. Since
> this is for encrypted guests, should the below dependencies be added to
> CONFIG_SEV_GUEST instead?
>
> Tom?
The placement rationale here was to have the DEVMEM restrictions next to
the ARCH_HAS_CC_PLATFORM 'select' statement which is INTEL_TDX_GUEST
and AMD_MEM_ENCRYPT with SEV_GUEST depending on AMD_MEM_ENCRYPT.
> > bool "AMD Secure Memory Encryption (SME) support"
> > depends on X86_64 && CPU_SUP_AMD
> > depends on EFI_STUB
> > + depends on STRICT_DEVMEM
> > + depends on IO_STRICT_DEVMEM
>
> Can we use 'select' for the dependency on IO_STRICT_DEVMEM, if not both
> the above?
>
> IO_STRICT_DEVMEM in particular is not enabled by default, so applying
> this patch and doing a 'make olddefconfig' disabled AMD_MEM_ENCRYPT,
> which is not so good. Given that IO_STRICT_DEVMEM only depends on
> STRICT_DEVMEM, I think a 'select' is ok.
Agree, that makes sense, and I do not think it will lead to any select
dependency problems given STRICT_DEVMEM is "default y" for x86.
>
> > select DMA_COHERENT_POOL
> > select ARCH_USE_MEMREMAP_PROT
> > select INSTRUCTION_DECODER
> > diff --git a/drivers/char/mem.c b/drivers/char/mem.c
> > index 48839958b0b1..47729606b817 100644
> > --- a/drivers/char/mem.c
> > +++ b/drivers/char/mem.c
> > @@ -30,6 +30,7 @@
> > #include <linux/uio.h>
> > #include <linux/uaccess.h>
> > #include <linux/security.h>
> > +#include <linux/cc_platform.h>
> >
> > #define DEVMEM_MINOR 1
> > #define DEVPORT_MINOR 4
> > @@ -595,6 +596,15 @@ static int open_port(struct inode *inode, struct file *filp)
> > if (rc)
> > return rc;
> >
> > + /*
> > + * Enforce encrypted mapping consistency and avoid unaccepted
> > + * memory conflicts, "lockdown" /dev/mem for confidential
> > + * guests.
> > + */
> > + if (IS_ENABLED(CONFIG_STRICT_DEVMEM) &&
> > + cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
> > + return -EPERM;
> > +
> > if (iminor(inode) != DEVMEM_MINOR)
> > return 0;
> >
> >
>
> Otherwise, this looks good to me.
Thanks Naveen, can I take that as an Acked-by?
Powered by blists - more mailing lists