[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ee2ccb9f-c689-710d-0297-63d8fc2c98dd@amazon.de>
Date: Mon, 7 Dec 2020 14:23:06 +0100
From: Alexander Graf <graf@...zon.de>
To: "Catangiu, Adrian Costin" <acatan@...zon.com>,
Christian Borntraeger <borntraeger@...ibm.com>,
"Jason A. Donenfeld" <Jason@...c4.com>,
Jann Horn <jannh@...gle.com>
CC: Willy Tarreau <w@....eu>,
"MacCarthaigh, Colm" <colmmacc@...zon.com>,
"Andy Lutomirski" <luto@...nel.org>,
"Theodore Y. Ts'o" <tytso@....edu>,
"Eric Biggers" <ebiggers@...nel.org>,
"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
kernel list <linux-kernel@...r.kernel.org>,
"Woodhouse, David" <dwmw@...zon.co.uk>,
"bonzini@....org" <bonzini@....org>,
"Singh, Balbir" <sblbir@...zon.com>,
"Weiss, Radu" <raduweis@...zon.com>,
"oridgar@...il.com" <oridgar@...il.com>,
"ghammer@...hat.com" <ghammer@...hat.com>,
Jonathan Corbet <corbet@....net>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Michael S. Tsirkin" <mst@...hat.com>,
"Qemu Developers" <qemu-devel@...gnu.org>,
KVM list <kvm@...r.kernel.org>,
"Michal Hocko" <mhocko@...nel.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
"Pavel Machek" <pavel@....cz>,
Linux API <linux-api@...r.kernel.org>,
"mpe@...erman.id.au" <mpe@...erman.id.au>,
linux-s390 <linux-s390@...r.kernel.org>,
"areber@...hat.com" <areber@...hat.com>,
"Pavel Emelyanov" <ovzxemul@...il.com>,
Andrey Vagin <avagin@...il.com>,
"Mike Rapoport" <rppt@...nel.org>,
Dmitry Safonov <0x7f454c46@...il.com>,
"Pavel Tikhomirov" <ptikhomirov@...tuozzo.com>,
"gil@...l.com" <gil@...l.com>,
"asmehra@...hat.com" <asmehra@...hat.com>,
"dgunigun@...hat.com" <dgunigun@...hat.com>,
"vijaysun@...ibm.com" <vijaysun@...ibm.com>
Subject: Re: [PATCH v2] drivers/virt: vmgenid: add vm generation id driver
On 27.11.20 18:17, Catangiu, Adrian Costin wrote:
>
> On 18/11/2020 12:30, Alexander Graf wrote:
>>
>>
>> On 16.11.20 16:34, Catangiu, Adrian Costin wrote:
>>> - Future improvements
>>>
>>> Ideally we would want the driver to register itself based on devices'
>>> _CID and not _HID, but unfortunately I couldn't find a way to do that.
>>> The problem is that ACPI device matching is done by
>>> '__acpi_match_device()' which exclusively looks at
>>> 'acpi_hardware_id *hwid'.
>>>
>>> There is a path for platform devices to match on _CID when _HID is
>>> 'PRP0001' - but this is not the case for the Qemu vmgenid device.
>>>
>>> Guidance and help here would be greatly appreciated.
>>
>> That one is pretty important IMHO. How about the following (probably
>> pretty mangled) patch? That seems to work for me. The ACPI change
>> would obviously need to be its own stand alone change and needs proper
>> assessment whether it could possibly break any existing systems.
>>
>> diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
>> index 1682f8b454a2..452443d79d87 100644
>> --- a/drivers/acpi/bus.c
>> +++ b/drivers/acpi/bus.c
>> @@ -748,7 +748,7 @@ static bool __acpi_match_device(struct acpi_device
>> *device,
>> /* First, check the ACPI/PNP IDs provided by the caller. */
>> if (acpi_ids) {
>> for (id = acpi_ids; id->id[0] || id->cls; id++) {
>> - if (id->id[0] && !strcmp((char *)id->id, hwid->id))
>> + if (id->id[0] && !strncmp((char *)id->id, hwid->id,
>> ACPI_ID_LEN - 1))
>> goto out_acpi_match;
>> if (id->cls && __acpi_match_device_cls(id, hwid))
>> goto out_acpi_match;
>> diff --git a/drivers/virt/vmgenid.c b/drivers/virt/vmgenid.c
>> index 75a787da8aad..0bfa422cf094 100644
>> --- a/drivers/virt/vmgenid.c
>> +++ b/drivers/virt/vmgenid.c
>> @@ -356,7 +356,8 @@ static void vmgenid_acpi_notify(struct acpi_device
>> *device, u32 event)
>> }
>>
>> static const struct acpi_device_id vmgenid_ids[] = {
>> - {"QEMUVGID", 0},
>> + /* This really is VM_Gen_Counter, but we can only match 8
>> characters */
>> + {"VM_GEN_C", 0},
>> {"", 0},
>> };
>>
>
> Looks legit. I can propose a patch with it, but how do we validate it
> doesn't break any devices?
Mainly by proposing it and seeing what the ACPI maintainers say. Maybe
they have a better idea even. At least this explictly nudges them.
>
>
>>> +2) ASYNC simplified example::
>>> +
>>> + void handle_io_on_vmgenfd(int vmgenfd)
>>> + {
>>> + unsigned genid;
>>> +
>>> + // because of VM generation change, we need to rebuild world
>>> + reseed_app_env();
>>> +
>>> + // read new gen ID - we need it to confirm we've handled update
>>> + read(fd, &genid, sizeof(genid));
>>
>> This is racy in case two consecutive snapshots happen. The read needs
>> to go before the reseed.
>>
> Switched them around like you suggest to avoid confusion.
>
> But I don't see a problem with this race. The idea here is to trigger
> reseed_app_env() which doesn't depend on the generation counter value.
> Whether it gets incremented once or N times is irrelevant, we're just
> interested that we pause execution and reseed before resuming (in
> between these, whether N or M generation changes is the same thing).
>
>>> +3) Mapped memory polling simplified example::
>>> +
>>> + /*
>>> + * app/library function that provides cached secrets
>>> + */
>>> + char * safe_cached_secret(app_data_t *app)
>>> + {
>>> + char *secret;
>>> + volatile unsigned *const genid_ptr = get_vmgenid_mapping(app);
>>> + again:
>>> + secret = __cached_secret(app);
>>> +
*genid_ptr = 1
cached_genid = 1
>>> + if (unlikely(*genid_ptr != app->cached_genid)) {
*genid_ptr = 2
cached_genid = 1
>>> + // rebuild world then confirm the genid update (thru write)
>>> + rebuild_caches(app);
hypervisor takes another snapshot during rebuild_caches(). Resume path
bumps genid
>>> + app->cached_genid = *genid_ptr;
*genid_ptr = 3
cached_genid = 3
>>
>> This is racy again. You need to read the genid before rebuild and set
>> it here.
>>
> I don't see the race. Gen counter is read from volatile mapped mem, on
> detected change we rebuild world, confirm the update back to the driver
> then restart the loop. Loop will break when no more changes happen.
See above. After the outlined course of things, the snapshot will
contain data that will be identical between 2 snapshots.
Alex
Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879
Powered by blists - more mailing lists