Date:   Thu, 19 Nov 2020 19:36:49 +0100
From:   Alexander Graf <graf@...zon.de>
To:     Mike Rapoport <rppt@...nel.org>
CC:     Christian Borntraeger <borntraeger@...ibm.com>,
        "Catangiu, Adrian Costin" <acatan@...zon.com>,
        "Jason A. Donenfeld" <Jason@...c4.com>,
        Jann Horn <jannh@...gle.com>, Willy Tarreau <w@....eu>,
        "MacCarthaigh, Colm" <colmmacc@...zon.com>,
        Andy Lutomirski <luto@...nel.org>,
        "Theodore Y. Ts'o" <tytso@....edu>,
        Eric Biggers <ebiggers@...nel.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        kernel list <linux-kernel@...r.kernel.org>,
        "Woodhouse, David" <dwmw@...zon.co.uk>,
        "bonzini@....org" <bonzini@....org>,
        "Singh, Balbir" <sblbir@...zon.com>,
        "Weiss, Radu" <raduweis@...zon.com>,
        "oridgar@...il.com" <oridgar@...il.com>,
        "ghammer@...hat.com" <ghammer@...hat.com>,
        Jonathan Corbet <corbet@....net>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Qemu Developers <qemu-devel@...gnu.org>,
        KVM list <kvm@...r.kernel.org>,
        Michal Hocko <mhocko@...nel.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Pavel Machek <pavel@....cz>,
        Linux API <linux-api@...r.kernel.org>,
        "mpe@...erman.id.au" <mpe@...erman.id.au>,
        linux-s390 <linux-s390@...r.kernel.org>,
        "areber@...hat.com" <areber@...hat.com>,
        Pavel Emelyanov <ovzxemul@...il.com>,
        Andrey Vagin <avagin@...il.com>,
        Dmitry Safonov <0x7f454c46@...il.com>,
        Pavel Tikhomirov <ptikhomirov@...tuozzo.com>,
        "gil@...l.com" <gil@...l.com>,
        "asmehra@...hat.com" <asmehra@...hat.com>,
        "dgunigun@...hat.com" <dgunigun@...hat.com>,
        "vijaysun@...ibm.com" <vijaysun@...ibm.com>
Subject: Re: [PATCH v2] drivers/virt: vmgenid: add vm generation id driver



On 19.11.20 18:38, Mike Rapoport wrote:
> 
> On Thu, Nov 19, 2020 at 01:51:18PM +0100, Alexander Graf wrote:
>>
>>
>> On 19.11.20 13:02, Christian Borntraeger wrote:
>>>
>>> On 16.11.20 16:34, Catangiu, Adrian Costin wrote:
>>>> - Background
>>>>
>>>> The VM Generation ID is a feature defined by Microsoft (paper:
>>>> http://go.microsoft.com/fwlink/?LinkId=260709) and supported by
>>>> multiple hypervisor vendors.
>>>>
>>>> The feature is required in virtualized environments by apps that work
>>>> with local copies/caches of world-unique data such as random values,
>>>> uuids, monotonically increasing counters, etc.
>>>> Such apps can be negatively affected by VM snapshotting when the VM
>>>> is either cloned or returned to an earlier point in time.
>>>>
>>>> The VM Generation ID is a simple concept meant to alleviate the issue
>>>> by providing a unique ID that changes each time the VM is restored
>>>> from a snapshot. The hw provided UUID value can be used to
>>>> differentiate between VMs or different generations of the same VM.
>>>>
>>>> - Problem
>>>>
>>>> The VM Generation ID is exposed through an ACPI device by multiple
>>>> hypervisor vendors but neither the vendors nor upstream Linux have a
>>>> default driver for it, leaving users to fend for themselves.
>>>
>>> I see that the qemu implementation is still under discussion. What is
>>
>> Uh, the ACPI Vmgenid device emulation is in QEMU since 2.9.0 :).
>>
>>> the status of the other existing implementations? Do they already exist?
>>> In other words, is ACPI a given?
>>> I think the majority of this driver could be used with just a different
>>> backend for platforms without ACPI so in any case we could factor out
>>> the backend (acpi, virtio, whatever) but if we are open we could maybe
>>> start with something else.
>>
>> I agree 100%. I don't think we really need a new framework in the kernel for
>> that. We can just have for example an s390x specific driver that also
>> provides the same notification mechanism through a device node that is also
>> named "/dev/vmgenid", no?
>>
>> Or alternatively we can split the generic part of this driver as soon as a
>> second one comes along and then have both drivers include that generic logic.
>>
>> The only piece where I'm unsure is how this will interact with CRIU.
> 
> To C/R applications that use /dev/vmgenid, CRIU needs to be aware of it.
> Checkpointing and restoring within the same "VM generation" shouldn't be
> a problem, but IMHO, making restore work after a genid bump could be
> challenging.
> 
> Alex, what scenario involving CRIU did you have in mind?

You can in theory run into the same situation with containers that this 
patch is solving for virtual machines. You could for example do a 
snapshot of a prewarmed Java runtime with CRIU to get full JIT speeds 
starting from the first request.

That, however, means you run into the problem of predictable randomness again.
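
To make the failure mode concrete, here is a minimal userspace sketch. It is
not taken from the patch: struct prng, prng_next() and prng_reseed_if_stale()
are hypothetical names, and xorshift64 merely stands in for whatever cached
random state a runtime keeps. Two processes restored from the same snapshot
start from identical state and so produce identical output until one of them
notices a generation change and mixes in fresh entropy.

```c
#include <stdint.h>

/* Hypothetical userspace PRNG state; stands in for any cached randomness. */
struct prng {
	uint64_t s;        /* xorshift64 state, must be non-zero */
	uint32_t seen_gen; /* generation observed when the state was last seeded */
};

/* One xorshift64 step. Purely deterministic in the stored state. */
static uint64_t prng_next(struct prng *p)
{
	uint64_t x = p->s;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	p->s = x;
	return x;
}

/*
 * Hypothetical reseed hook: if the generation moved since we last looked,
 * mix caller-supplied fresh entropy into the state. How cur_gen and fresh
 * are obtained is outside the scope of this sketch.
 */
static void prng_reseed_if_stale(struct prng *p, uint32_t cur_gen,
				 uint64_t fresh)
{
	if (p->seen_gen == cur_gen)
		return;
	p->s ^= fresh;
	if (p->s == 0)
		p->s = 1; /* xorshift64 must never hold an all-zero state */
	p->seen_gen = cur_gen;
}
```

Copying a struct prng by value models the snapshot: both copies return the
same prng_next() sequence unless prng_reseed_if_stale() runs on at least one
of them with a new generation.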

> 
>> Can containers emulate ioctls and device nodes?
> 
> Containers do not emulate ioctls but they can have /dev/vmgenid inside
> the container, so applications can use it the same way as outside the
> container.

Hm. I suppose we could add a CAP_SYS_ADMIN ioctl interface to /dev/vmgenid
(when container people get to the point of needing it) that sets the
generation to "at least X". That way on restore, you could just call
that with "generation at snapshot"+1.
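
A rough sketch of the "at least X" semantics, under stated assumptions: the
generation is modeled as a plain 32-bit counter, struct vmgenid_state and
vmgenid_set_at_least() are made-up names, and the privileged flag stands in
for whatever capability check the real ioctl handler would do. None of this
is in the posted driver.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical generation state; the real driver's layout may differ. */
struct vmgenid_state {
	uint32_t generation;
};

/*
 * Hypothetical handler body for a "set generation to at least min_gen"
 * request. The counter only ever moves forward, so a stale or repeated
 * request is a no-op instead of a rollback.
 * Returns 0 on success, -EPERM if the caller is not privileged.
 */
static int vmgenid_set_at_least(struct vmgenid_state *st, uint32_t min_gen,
				bool privileged)
{
	if (!privileged)
		return -EPERM;
	if (st->generation < min_gen)
		st->generation = min_gen;
	return 0;
}
```

On restore, the checkpoint tool would call this with the generation it
recorded at snapshot time plus one, so every restored copy sees a generation
newer than the one it was frozen with.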

That also means we need to have this interface available without virtual 
machines then though, right?


Alex




