Message-ID: <145eaa0b-a118-1d80-7f2c-d73f0d3f1db0@amazon.de>
Date:   Mon, 7 Dec 2020 14:11:04 +0100
From:   Alexander Graf <graf@...zon.de>
To:     "Catangiu, Adrian Costin" <acatan@...zon.com>,
        Dmitry Safonov <0x7f454c46@...il.com>
CC:     Mike Rapoport <rppt@...nel.org>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        "Jason A. Donenfeld" <Jason@...c4.com>,
        Jann Horn <jannh@...gle.com>, Willy Tarreau <w@....eu>,
        "MacCarthaigh, Colm" <colmmacc@...zon.com>,
        Andy Lutomirski <luto@...nel.org>,
        "Theodore Y. Ts'o" <tytso@....edu>,
        Eric Biggers <ebiggers@...nel.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        kernel list <linux-kernel@...r.kernel.org>,
        "Woodhouse, David" <dwmw@...zon.co.uk>,
        "bonzini@....org" <bonzini@....org>,
        "Singh, Balbir" <sblbir@...zon.com>,
        "Weiss, Radu" <raduweis@...zon.com>,
        "oridgar@...il.com" <oridgar@...il.com>,
        "ghammer@...hat.com" <ghammer@...hat.com>,
        Jonathan Corbet <corbet@....net>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Qemu Developers <qemu-devel@...gnu.org>,
        KVM list <kvm@...r.kernel.org>,
        Michal Hocko <mhocko@...nel.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Pavel Machek <pavel@....cz>,
        Linux API <linux-api@...r.kernel.org>,
        "mpe@...erman.id.au" <mpe@...erman.id.au>,
        linux-s390 <linux-s390@...r.kernel.org>,
        "areber@...hat.com" <areber@...hat.com>,
        Pavel Emelyanov <ovzxemul@...il.com>,
        Andrey Vagin <avagin@...il.com>,
        Pavel Tikhomirov <ptikhomirov@...tuozzo.com>,
        "gil@...l.com" <gil@...l.com>,
        "asmehra@...hat.com" <asmehra@...hat.com>,
        "dgunigun@...hat.com" <dgunigun@...hat.com>,
        "vijaysun@...ibm.com" <vijaysun@...ibm.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>
Subject: Re: [PATCH v3] drivers/virt: vmgenid: add vm generation id driver



On 27.11.20 19:26, Catangiu, Adrian Costin wrote:
> - Background
> 
> The VM Generation ID is a feature defined by Microsoft (paper:
> http://go.microsoft.com/fwlink/?LinkId=260709) and supported by
> multiple hypervisor vendors.
> 
> The feature is required in virtualized environments by apps that work
> with local copies/caches of world-unique data such as random values,
> uuids, monotonically increasing counters, etc.
> Such apps can be negatively affected by VM snapshotting when the VM
> is either cloned or returned to an earlier point in time.
> 
> The VM Generation ID is a simple concept meant to alleviate the issue
> by providing a unique ID that changes each time the VM is restored
> from a snapshot. The hw provided UUID value can be used to
> differentiate between VMs or different generations of the same VM.
> 
> - Problem
> 
> The VM Generation ID is exposed through an ACPI device by multiple
> hypervisor vendors, but neither the vendors nor upstream Linux have a
> default driver for it, leaving users to fend for themselves.
> 
> Furthermore, simply finding out about a VM generation change is only
> the starting point of a process to renew internal states of possibly
> multiple applications across the system. This process could benefit
> from a driver that provides an interface through which orchestration
> can be easily done.
> 
> - Solution
> 
> This patch is a driver that exposes a monotonic incremental Virtual
> Machine Generation u32 counter via a char-dev FS interface. The FS
> interface provides sync and async VmGen counter updates notifications.
> It also provides VmGen counter retrieval and confirmation mechanisms.
> 
> The generation counter and the interface through which it is exposed
> are available even when there is no acpi device present.
> 
> When the device is present, the hw provided UUID is not exposed to
> userspace, it is internally used by the driver to keep accounting for
> the exposed VmGen counter. The counter starts from zero when the
> driver is initialized and monotonically increments every time the hw
> UUID changes (the VM generation changes).
> On each hw UUID change, the new hypervisor-provided UUID is also fed
> to the kernel RNG.
> 
> If there is no acpi vmgenid device present, the generation changes are
> not driven by hw vmgenid events but can be driven by software through
> a dedicated driver ioctl.
> 
> This patch builds on top of Or Idgar <oridgar@...il.com>'s proposal
> https://lkml.org/lkml/2018/3/1/498
> 
> - Future improvements
> 
> Ideally we would want the driver to register itself based on devices'
> _CID and not _HID, but unfortunately I couldn't find a way to do that.
> The problem is that ACPI device matching is done by
> '__acpi_match_device()' which exclusively looks at
> 'acpi_hardware_id *hwid'.
> 
> There is a path for platform devices to match on _CID when _HID is
> 'PRP0001' - but this is not the case for the Qemu vmgenid device.
> 
> Guidance and help here would be greatly appreciated.
> 
> Signed-off-by: Adrian Catangiu <acatan@...zon.com>
> 
> ---
> 
> v1 -> v2:
> 
>    - expose to userspace a monotonically increasing u32 Vm Gen Counter
>      instead of the hw VmGen UUID
>    - since the hw/hypervisor-provided 128-bit UUID is not public
>      anymore, add it to the kernel RNG as device randomness
>    - insert driver page containing Vm Gen Counter in the user vma in
>      the driver's mmap handler instead of using a fault handler
>    - turn driver into a misc device driver to auto-create /dev/vmgenid
>    - change ioctl arg to avoid leaking kernel structs to userspace
>    - update documentation
>    - various nits
>    - rebase on top of linus latest
> 
> v2 -> v3:
> 
>    - separate the core driver logic and interface, from the ACPI device.
>      The ACPI vmgenid device is now one possible backend.
>    - fix issue when timeout=0 in VMGENID_WAIT_WATCHERS
>    - add locking to avoid races between fs ops handlers and hw irq
>      driven generation updates
>    - change VMGENID_WAIT_WATCHERS ioctl so if the current caller is
>      outdated or a generation change happens while waiting (thus making
>      current caller outdated), the ioctl returns -EINTR to signal the
>      user to handle event and retry. Fixes blocking on oneself.
>    - add VMGENID_FORCE_GEN_UPDATE ioctl conditioned by
>      CAP_CHECKPOINT_RESTORE capability, through which software can force
>      generation bump.
> ---
>   Documentation/virt/vmgenid.rst | 240 +++++++++++++++++++++++
>   drivers/virt/Kconfig           |  17 ++
>   drivers/virt/Makefile          |   1 +
>   drivers/virt/vmgenid.c         | 435 +++++++++++++++++++++++++++++++++++++++++
>   include/uapi/linux/vmgenid.h   |  14 ++
>   5 files changed, 707 insertions(+)
>   create mode 100644 Documentation/virt/vmgenid.rst
>   create mode 100644 drivers/virt/vmgenid.c
>   create mode 100644 include/uapi/linux/vmgenid.h
> 

[...]

> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> index 80c5f9c1..5d5f37b 100644
> --- a/drivers/virt/Kconfig
> +++ b/drivers/virt/Kconfig
> @@ -13,6 +13,23 @@ menuconfig VIRT_DRIVERS
>   
>   if VIRT_DRIVERS
>   
> +config VMGENID
> +    tristate "Virtual Machine Generation ID driver"
> +    depends on ACPI

I think you want to split the Kconfig bit into two now: one option for 
generic /dev/vmgenid support and another one, ACPI_VMGENID, to 
automatically bump the generation when the hypervisor indicates a change.

In fact, you can probably make this two separate patches with two 
separate files (read: kernel modules) even. The generic code can just 
export symbols to bump the system genid.

I'm also not fully convinced that calling the generic mechanism 
"vmgenid" is still accurate at this point. Can you think of a better 
name? "System Generation ID", so "sysgenid" maybe?

> +    default N
> +    help
> +      This is a Virtual Machine Generation ID driver which provides
> +      a virtual machine generation counter. The driver exposes FS ops
> +      on /dev/vmgenid through which it can provide information and
> +      notifications on VM generation changes that happen on snapshots
> +      or cloning.
> +      This enables applications and libraries that store or cache
> +      sensitive information, to know that they need to regenerate it
> +      after process memory has been exposed to potential copying.
> +
> +      To compile this driver as a module, choose M here: the
> +      module will be called vmgenid.
> +
>   config FSL_HV_MANAGER
>       tristate "Freescale hypervisor management driver"
>       depends on FSL_SOC

[...]

> +    case VMGENID_FORCE_GEN_UPDATE:
> +        if (!checkpoint_restore_ns_capable(current_user_ns()))
> +            return -EACCES;
> +        vmgenid_bump_generation();

I think this is racy and needs to be slightly different. Imagine the 
following:

   - container is running with genid 5
   - I take a snapshot of the container
   - Target system has genid 4
   - I resume the container
   - I call the genid update (genid = 5)

Then the container still sees genid 5, so *maybe* it won't adapt to the 
new environment. This will depend on whether the container gets enough 
time to adjust to genid=4 before we bump it to 5.

How about we pass a "bump, but not to this value" argument to the ioctl? 
Then it would look like this:

   - container is running with genid 5
   - I take a snapshot of the container and its genid (5)
   - Target system has genid 4
   - I resume the container
   - I call the genid update with avoid=5 (so we bump genid to 6)

Now all processes in the system will adapt to genid=6, including the 
resumed container.
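
To make the "bump, but not to this value" semantics concrete, here is a 
minimal userspace sketch. It is only a model of the idea, not driver 
code: struct genid_state, genid_bump_avoid() and genid_example() are 
hypothetical names that do not exist in the patch, and the real ioctl 
would also need the locking and watcher notification that the driver 
already does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the generation counter; not the driver's state. */
struct genid_state {
    uint32_t gen;
};

/*
 * Bump the counter by one and, if that lands on the value the caller
 * asked to avoid, bump once more. Returns the new generation.
 * Unsigned arithmetic wraps, so the u32 overflow case is well defined.
 */
uint32_t genid_bump_avoid(struct genid_state *s, uint32_t avoid)
{
    s->gen++;
    if (s->gen == avoid)
        s->gen++;
    return s->gen;
}

/* Walks through the scenario described above. */
void genid_example(void)
{
    /* Target system is at genid 4; the container was snapshotted at 5. */
    struct genid_state target = { .gen = 4 };
    uint32_t snapshot_gen = 5;

    /* A plain increment would yield 5, which the container already saw. */
    uint32_t new_gen = genid_bump_avoid(&target, snapshot_gen);

    /* The avoided value is skipped, so the container observes a change. */
    assert(new_gen == 6);
    assert(new_gen != snapshot_gen);
}
```

The caller that restores the container would pass the genid it saved 
alongside the snapshot as the avoid value, so the result always differs 
from both the target's previous genid and the container's saved one.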


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879

