[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <05c9f2a9-accb-e0de-aac7-b212adac7eb2@amazon.com>
Date: Fri, 25 Feb 2022 14:57:38 +0100
From: Alexander Graf <graf@...zon.com>
To: "Jason A. Donenfeld" <Jason@...c4.com>, <kvm@...r.kernel.org>,
<linux-crypto@...r.kernel.org>, <linux-hyperv@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
CC: <adrian@...ity.io>, <ardb@...nel.org>, <ben@...portsystems.com>,
<berrange@...hat.com>, <colmmacc@...zon.com>,
<decui@...rosoft.com>, <dwmw@...zon.co.uk>, <ebiggers@...nel.org>,
<ehabkost@...hat.com>, <gregkh@...uxfoundation.org>,
<haiyangz@...rosoft.com>, <imammedo@...hat.com>,
<jannh@...gle.com>, <kys@...rosoft.com>, <lersek@...hat.com>,
<linux@...inikbrodowski.net>, <mst@...hat.com>,
<qemu-devel@...gnu.org>, <raduweis@...zon.com>,
<sthemmin@...rosoft.com>, <tytso@....edu>, <wei.liu@...nel.org>
Subject: Re: [PATCH v4] virt: vmgenid: introduce driver for reinitializing RNG
on VM fork
On 25.02.22 13:48, Jason A. Donenfeld wrote:
>
> VM Generation ID is a feature from Microsoft, described at
> <https://go.microsoft.com/fwlink/?LinkId=260709>, and supported by
> Hyper-V and QEMU. Its usage is described in Microsoft's RNG whitepaper,
> <https://aka.ms/win10rng>, as:
>
> If the OS is running in a VM, there is a problem that most
> hypervisors can snapshot the state of the machine and later rewind
> the VM state to the saved state. This results in the machine running
> a second time with the exact same RNG state, which leads to serious
> security problems. To reduce the window of vulnerability, Windows
> 10 on a Hyper-V VM will detect when the VM state is reset, retrieve
> a unique (not random) value from the hypervisor, and reseed the root
> RNG with that unique value. This does not eliminate the
> vulnerability, but it greatly reduces the time during which the RNG
> system will produce the same outputs as it did during a previous
> instantiation of the same VM state.
>
> Linux has the same issue, and given that vmgenid is supported already by
> multiple hypervisors, we can implement more or less the same solution.
> So this commit wires up the vmgenid ACPI notification to the RNG's newly
> added add_vmfork_randomness() function.
>
> It can be used from qemu via the `-device vmgenid,guid=auto` parameter.
> After setting that, use `savevm` in the monitor to save the VM state,
> then quit QEMU, start it again, and use `loadvm`. That will trigger this
> driver's notify function, which hands the new UUID to the RNG. This is
> described in <https://git.qemu.org/?p=qemu.git;a=blob;f=docs/specs/vmgenid.txt>.
> And there are hooks for this in libvirt as well, described in
> <https://libvirt.org/formatdomain.html#general-metadata>.
>
> Note, however, that the treatment of this as a UUID is considered to be
> an accidental QEMU nuance, per
> <https://github.com/libguestfs/virt-v2v/blob/master/docs/vm-generation-id-across-hypervisors.txt>,
> so this driver simply treats these bytes as an opaque 128-bit binary
> blob, as per the spec. This doesn't really make a difference anyway,
> considering that's how it ends up when handed to the RNG in the end.
>
> Cc: Adrian Catangiu <adrian@...ity.io>
> Cc: Daniel P. Berrangé <berrange@...hat.com>
> Cc: Dominik Brodowski <linux@...inikbrodowski.net>
> Cc: Ard Biesheuvel <ardb@...nel.org>
> Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> Reviewed-by: Laszlo Ersek <lersek@...hat.com>
> Signed-off-by: Jason A. Donenfeld <Jason@...c4.com>
> ---
> Changes v3->v4:
> - Add this driver to MAINTAINERS, per Ard's request.
> Note: I didn't really want to do this at first, because I was hoping the
> original Amazon team looking into this last year would step up. But it seems
> like that team has moved on, and anyway I've basically rewritten the driver
> from scratch at this point -- not a single line of the original exists --
> and so I guess I'll maintain it myself. Adding Greg to the CC for his ack on
> this.
> - Don't use a static global state in case there are multiple instances.
> - Use devm_memremap instead of the acpi internal functions.
> - Default to being modular instead of a built-in, as apparently this is
> udev-able.
>
> MAINTAINERS | 1 +
> drivers/virt/Kconfig | 9 ++++
> drivers/virt/Makefile | 1 +
> drivers/virt/vmgenid.c | 112 +++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 123 insertions(+)
> create mode 100644 drivers/virt/vmgenid.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 777cd6fa2b3d..a10997e15146 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -16211,6 +16211,7 @@ M: Jason A. Donenfeld <Jason@...c4.com>
> T: git https://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git
> S: Maintained
> F: drivers/char/random.c
> +F: drivers/virt/vmgenid.c
>
> RAPIDIO SUBSYSTEM
> M: Matt Porter <mporter@...nel.crashing.org>
> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> index 8061e8ef449f..5596c7313f59 100644
> --- a/drivers/virt/Kconfig
> +++ b/drivers/virt/Kconfig
> @@ -13,6 +13,15 @@ menuconfig VIRT_DRIVERS
>
> if VIRT_DRIVERS
>
> +config VMGENID
> + tristate "Virtual Machine Generation ID driver"
> + default m
> + depends on ACPI
> + help
> + Say Y here to use the hypervisor-provided Virtual Machine Generation ID
> + to reseed the RNG when the VM is cloned. This is highly recommended if
> + you intend to do any rollback / cloning / snapshotting of VMs.
> +
> config FSL_HV_MANAGER
> tristate "Freescale hypervisor management driver"
> depends on FSL_SOC
> diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
> index 3e272ea60cd9..108d0ffcc9aa 100644
> --- a/drivers/virt/Makefile
> +++ b/drivers/virt/Makefile
> @@ -4,6 +4,7 @@
> #
>
> obj-$(CONFIG_FSL_HV_MANAGER) += fsl_hypervisor.o
> +obj-$(CONFIG_VMGENID) += vmgenid.o
> obj-y += vboxguest/
>
> obj-$(CONFIG_NITRO_ENCLAVES) += nitro_enclaves/
> diff --git a/drivers/virt/vmgenid.c b/drivers/virt/vmgenid.c
> new file mode 100644
> index 000000000000..e3dd4afb33c6
> --- /dev/null
> +++ b/drivers/virt/vmgenid.c
> @@ -0,0 +1,112 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2022 Jason A. Donenfeld <Jason@...c4.com>. All Rights Reserved.
> + *
> + * The "Virtual Machine Generation ID" is exposed via ACPI and changes when a
> + * virtual machine forks or is cloned. This driver exists for shepherding that
> + * information to random.c.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/acpi.h>
> +#include <linux/random.h>
> +
> +ACPI_MODULE_NAME("vmgenid");
> +
> +enum { VMGENID_SIZE = 16 };
> +
> +struct vmgenid_state {
> + u8 *next_id;
> + u8 this_id[VMGENID_SIZE];
> +};
> +
> +static int vmgenid_acpi_add(struct acpi_device *device)
> +{
> + struct acpi_buffer parsed = { ACPI_ALLOCATE_BUFFER };
> + struct vmgenid_state *state;
> + union acpi_object *obj;
> + phys_addr_t phys_addr;
> + acpi_status status;
> + int ret = 0;
> +
> + state = devm_kmalloc(&device->dev, sizeof(*state), GFP_KERNEL);
> + if (!state)
> + return -ENOMEM;
> +
> + status = acpi_evaluate_object(device->handle, "ADDR", NULL, &parsed);
> + if (ACPI_FAILURE(status)) {
> + ACPI_EXCEPTION((AE_INFO, status, "Evaluating ADDR"));
> + return -ENODEV;
> + }
> + obj = parsed.pointer;
> + if (!obj || obj->type != ACPI_TYPE_PACKAGE || obj->package.count != 2 ||
> + obj->package.elements[0].type != ACPI_TYPE_INTEGER ||
> + obj->package.elements[1].type != ACPI_TYPE_INTEGER) {
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + phys_addr = (obj->package.elements[0].integer.value << 0) |
> + (obj->package.elements[1].integer.value << 32);
> + state->next_id = devm_memremap(&device->dev, phys_addr, VMGENID_SIZE, MEMREMAP_WB);
> + if (!state->next_id) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + memcpy(state->this_id, state->next_id, sizeof(state->this_id));
> + add_device_randomness(state->this_id, sizeof(state->this_id));
Please expose the vmgenid via /sysfs so that user space even remotely
has a chance to check if it's been cloned.
> +
> + device->driver_data = state;
> +
> +out:
> + ACPI_FREE(parsed.pointer);
> + return ret;
> +}
> +
> +static void vmgenid_acpi_notify(struct acpi_device *device, u32 event)
> +{
> + struct vmgenid_state *state = acpi_driver_data(device);
> + u8 old_id[VMGENID_SIZE];
> +
> + memcpy(old_id, state->this_id, sizeof(old_id));
> + memcpy(state->this_id, state->next_id, sizeof(state->this_id));
> + if (!memcmp(old_id, state->this_id, sizeof(old_id)))
> + return;
> + add_vmfork_randomness(state->this_id, sizeof(state->this_id));
> +}
> +
> +static const struct acpi_device_id vmgenid_ids[] = {
> + { "VMGENID", 0 },
> + { "QEMUVGID", 0 },
According to the VMGenID spec[1], you can only rely on _CID and _DDN for
matching. They both contain "VM_Gen_Counter". The list above contains
_HID values which are not an official identifier for the VMGenID device.
IIRC the ACPI device match logic does match _CID in addition to _HID.
However, it is limited to 8 characters. Let me paste an experimental
hack I did back then to do the _CID matching instead.
[1]
https://download.microsoft.com/download/3/1/C/31CFC307-98CA-4CA5-914C-D9772691E214/VirtualMachineGenerationID.docx
Alex
diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index 1682f8b454a2..452443d79d87 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -748,7 +748,7 @@ static bool __acpi_match_device(struct acpi_device
*device,
/* First, check the ACPI/PNP IDs provided by the caller. */
if (acpi_ids) {
for (id = acpi_ids; id->id[0] || id->cls; id++) {
- if (id->id[0] && !strcmp((char *)id->id, hwid->id))
+ if (id->id[0] && !strncmp((char *)id->id, hwid->id,
ACPI_ID_LEN - 1))
goto out_acpi_match;
if (id->cls && __acpi_match_device_cls(id, hwid))
goto out_acpi_match;
diff --git a/drivers/virt/vmgenid.c b/drivers/virt/vmgenid.c
index 75a787da8aad..0bfa422cf094 100644
--- a/drivers/virt/vmgenid.c
+++ b/drivers/virt/vmgenid.c
@@ -356,7 +356,8 @@ static void vmgenid_acpi_notify(struct acpi_device
*device, u32 event)
}
static const struct acpi_device_id vmgenid_ids[] = {
- {"QEMUVGID", 0},
+ /* This really is VM_Gen_Counter, but we can only match 8 characters */
+ {"VM_GEN_C", 0},
{"", 0},
};
Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879
Powered by blists - more mailing lists