lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ee2ccb9f-c689-710d-0297-63d8fc2c98dd@amazon.de>
Date:   Mon, 7 Dec 2020 14:23:06 +0100
From:   Alexander Graf <graf@...zon.de>
To:     "Catangiu, Adrian Costin" <acatan@...zon.com>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        "Jason A. Donenfeld" <Jason@...c4.com>,
        Jann Horn <jannh@...gle.com>
CC:     Willy Tarreau <w@....eu>,
        "MacCarthaigh, Colm" <colmmacc@...zon.com>,
        "Andy Lutomirski" <luto@...nel.org>,
        "Theodore Y. Ts'o" <tytso@....edu>,
        "Eric Biggers" <ebiggers@...nel.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        kernel list <linux-kernel@...r.kernel.org>,
        "Woodhouse, David" <dwmw@...zon.co.uk>,
        "bonzini@....org" <bonzini@....org>,
        "Singh, Balbir" <sblbir@...zon.com>,
        "Weiss, Radu" <raduweis@...zon.com>,
        "oridgar@...il.com" <oridgar@...il.com>,
        "ghammer@...hat.com" <ghammer@...hat.com>,
        Jonathan Corbet <corbet@....net>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        "Qemu Developers" <qemu-devel@...gnu.org>,
        KVM list <kvm@...r.kernel.org>,
        "Michal Hocko" <mhocko@...nel.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        "Pavel Machek" <pavel@....cz>,
        Linux API <linux-api@...r.kernel.org>,
        "mpe@...erman.id.au" <mpe@...erman.id.au>,
        linux-s390 <linux-s390@...r.kernel.org>,
        "areber@...hat.com" <areber@...hat.com>,
        "Pavel Emelyanov" <ovzxemul@...il.com>,
        Andrey Vagin <avagin@...il.com>,
        "Mike Rapoport" <rppt@...nel.org>,
        Dmitry Safonov <0x7f454c46@...il.com>,
        "Pavel Tikhomirov" <ptikhomirov@...tuozzo.com>,
        "gil@...l.com" <gil@...l.com>,
        "asmehra@...hat.com" <asmehra@...hat.com>,
        "dgunigun@...hat.com" <dgunigun@...hat.com>,
        "vijaysun@...ibm.com" <vijaysun@...ibm.com>
Subject: Re: [PATCH v2] drivers/virt: vmgenid: add vm generation id driver



On 27.11.20 18:17, Catangiu, Adrian Costin wrote:
> 
> On 18/11/2020 12:30, Alexander Graf wrote:
>>
>>
>> On 16.11.20 16:34, Catangiu, Adrian Costin wrote:
>>> - Future improvements
>>>
>>> Ideally we would want the driver to register itself based on devices'
>>> _CID and not _HID, but unfortunately I couldn't find a way to do that.
>>> The problem is that ACPI device matching is done by
>>> '__acpi_match_device()' which exclusively looks at
>>> 'acpi_hardware_id *hwid'.
>>>
>>> There is a path for platform devices to match on _CID when _HID is
>>> 'PRP0001' - but this is not the case for the Qemu vmgenid device.
>>>
>>> Guidance and help here would be greatly appreciated.
>>
>> That one is pretty important IMHO. How about the following (probably
>> pretty mangled) patch? That seems to work for me. The ACPI change
>> would obviously need to be its own stand alone change and needs proper
>> assessment whether it could possibly break any existing systems.
>>
>> diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
>> index 1682f8b454a2..452443d79d87 100644
>> --- a/drivers/acpi/bus.c
>> +++ b/drivers/acpi/bus.c
>> @@ -748,7 +748,7 @@ static bool __acpi_match_device(struct acpi_device
>> *device,
>>           /* First, check the ACPI/PNP IDs provided by the caller. */
>>           if (acpi_ids) {
>>               for (id = acpi_ids; id->id[0] || id->cls; id++) {
>> -                if (id->id[0] && !strcmp((char *)id->id, hwid->id))
>> +                if (id->id[0] && !strncmp((char *)id->id, hwid->id,
>> ACPI_ID_LEN - 1))
>>                       goto out_acpi_match;
>>                   if (id->cls && __acpi_match_device_cls(id, hwid))
>>                       goto out_acpi_match;
>> diff --git a/drivers/virt/vmgenid.c b/drivers/virt/vmgenid.c
>> index 75a787da8aad..0bfa422cf094 100644
>> --- a/drivers/virt/vmgenid.c
>> +++ b/drivers/virt/vmgenid.c
>> @@ -356,7 +356,8 @@ static void vmgenid_acpi_notify(struct acpi_device
>> *device, u32 event)
>>   }
>>
>>   static const struct acpi_device_id vmgenid_ids[] = {
>> -    {"QEMUVGID", 0},
>> +    /* This really is VM_Gen_Counter, but we can only match 8
>> characters */
>> +    {"VM_GEN_C", 0},
>>       {"", 0},
>>   };
>>
> 
> Looks legit. I can propose a patch with it, but how do we validate it
> doesn't break any devices?

Mainly by proposing it and seeing what the ACPI maintainers say. Maybe 
they have a better idea even. At least this explictly nudges them.

> 
> 
>>> +2) ASYNC simplified example::
>>> +
>>> +    void handle_io_on_vmgenfd(int vmgenfd)
>>> +    {
>>> +        unsigned genid;
>>> +
>>> +        // because of VM generation change, we need to rebuild world
>>> +        reseed_app_env();
>>> +
>>> +        // read new gen ID - we need it to confirm we've handled update
>>> +        read(fd, &genid, sizeof(genid));
>>
>> This is racy in case two consecutive snapshots happen. The read needs
>> to go before the reseed.
>>
> Switched them around like you suggest to avoid confusion.
> 
> But I don't see a problem with this race. The idea here is to trigger
> reseed_app_env() which doesn't depend on the generation counter value.
> Whether it gets incremented once or N times is irrelevant, we're just
> interested that we pause execution and reseed before resuming (in
> between these, whether N or M generation changes is the same thing).
> 
>>> +3) Mapped memory polling simplified example::
>>> +
>>> +    /*
>>> +     * app/library function that provides cached secrets
>>> +     */
>>> +    char * safe_cached_secret(app_data_t *app)
>>> +    {
>>> +        char *secret;
>>> +        volatile unsigned *const genid_ptr = get_vmgenid_mapping(app);
>>> +    again:
>>> +        secret = __cached_secret(app);
>>> +

*genid_ptr = 1
cached_genid = 1

>>> +        if (unlikely(*genid_ptr != app->cached_genid)) {

*genid_ptr = 2
cached_genid = 1

>>> +            // rebuild world then confirm the genid update (thru write)
>>> +            rebuild_caches(app);

hypervisor takes another snapshot during rebuild_caches(). Resume path 
bumps genid

>>> +            app->cached_genid = *genid_ptr;

*genid_ptr = 3
cached_genid = 3

>>
>> This is racy again. You need to read the genid before rebuild and set
>> it here.
>>
> I don't see the race. Gen counter is read from volatile mapped mem, on
> detected change we rebuild world, confirm the update back to the driver
> then restart the loop. Loop will break when no more changes happen.

See above. After the outlined course of things, the snapshot will 
contain data that will be identical between 2 snapshots.


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ