Message-ID: <ded94f0f-9c60-38b3-6217-03d3c0edd613@amazon.com>
Date: Fri, 27 Nov 2020 19:17:53 +0200
From: "Catangiu, Adrian Costin" <acatan@...zon.com>
To: Alexander Graf <graf@...zon.de>,
Christian Borntraeger <borntraeger@...ibm.com>,
"Jason A. Donenfeld" <Jason@...c4.com>,
Jann Horn <jannh@...gle.com>
CC: Willy Tarreau <w@....eu>,
"MacCarthaigh, Colm" <colmmacc@...zon.com>,
Andy Lutomirski <luto@...nel.org>,
"Theodore Y. Ts'o" <tytso@....edu>,
Eric Biggers <ebiggers@...nel.org>,
"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
kernel list <linux-kernel@...r.kernel.org>,
"Woodhouse, David" <dwmw@...zon.co.uk>,
"bonzini@....org" <bonzini@....org>,
"Singh, Balbir" <sblbir@...zon.com>,
"Weiss, Radu" <raduweis@...zon.com>,
"oridgar@...il.com" <oridgar@...il.com>,
"ghammer@...hat.com" <ghammer@...hat.com>,
Jonathan Corbet <corbet@....net>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Michael S. Tsirkin" <mst@...hat.com>,
Qemu Developers <qemu-devel@...gnu.org>,
KVM list <kvm@...r.kernel.org>,
Michal Hocko <mhocko@...nel.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Pavel Machek <pavel@....cz>,
Linux API <linux-api@...r.kernel.org>,
"mpe@...erman.id.au" <mpe@...erman.id.au>,
linux-s390 <linux-s390@...r.kernel.org>,
"areber@...hat.com" <areber@...hat.com>,
Pavel Emelyanov <ovzxemul@...il.com>,
Andrey Vagin <avagin@...il.com>,
Mike Rapoport <rppt@...nel.org>,
Dmitry Safonov <0x7f454c46@...il.com>,
Pavel Tikhomirov <ptikhomirov@...tuozzo.com>,
"gil@...l.com" <gil@...l.com>,
"asmehra@...hat.com" <asmehra@...hat.com>,
"dgunigun@...hat.com" <dgunigun@...hat.com>,
"vijaysun@...ibm.com" <vijaysun@...ibm.com>
Subject: Re: [PATCH v2] drivers/virt: vmgenid: add vm generation id driver
On 18/11/2020 12:30, Alexander Graf wrote:
>
>
> On 16.11.20 16:34, Catangiu, Adrian Costin wrote:
>> - Future improvements
>>
>> Ideally we would want the driver to register itself based on devices'
>> _CID and not _HID, but unfortunately I couldn't find a way to do that.
>> The problem is that ACPI device matching is done by
>> '__acpi_match_device()' which exclusively looks at
>> 'acpi_hardware_id *hwid'.
>>
>> There is a path for platform devices to match on _CID when _HID is
>> 'PRP0001' - but this is not the case for the Qemu vmgenid device.
>>
>> Guidance and help here would be greatly appreciated.
>
> That one is pretty important IMHO. How about the following (probably
> pretty mangled) patch? That seems to work for me. The ACPI change
> would obviously need to be its own stand alone change and needs proper
> assessment whether it could possibly break any existing systems.
>
> diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
> index 1682f8b454a2..452443d79d87 100644
> --- a/drivers/acpi/bus.c
> +++ b/drivers/acpi/bus.c
> @@ -748,7 +748,7 @@ static bool __acpi_match_device(struct acpi_device *device,
> /* First, check the ACPI/PNP IDs provided by the caller. */
> if (acpi_ids) {
> for (id = acpi_ids; id->id[0] || id->cls; id++) {
> - if (id->id[0] && !strcmp((char *)id->id, hwid->id))
> + if (id->id[0] && !strncmp((char *)id->id, hwid->id, ACPI_ID_LEN - 1))
> goto out_acpi_match;
> if (id->cls && __acpi_match_device_cls(id, hwid))
> goto out_acpi_match;
> diff --git a/drivers/virt/vmgenid.c b/drivers/virt/vmgenid.c
> index 75a787da8aad..0bfa422cf094 100644
> --- a/drivers/virt/vmgenid.c
> +++ b/drivers/virt/vmgenid.c
> @@ -356,7 +356,8 @@ static void vmgenid_acpi_notify(struct acpi_device *device, u32 event)
> }
>
> static const struct acpi_device_id vmgenid_ids[] = {
> - {"QEMUVGID", 0},
> + /* This really is VM_Gen_Counter, but we can only match 8 characters */
> + {"VM_GEN_C", 0},
> {"", 0},
> };
>
Looks legit. I can propose a patch with it, but how do we validate it
doesn't break any devices?
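
To make the validation question concrete, here is a small userspace sketch (not kernel code) of how the two comparison rules differ. ACPI_ID_LEN is defined locally as 9, an assumption taken from the "we can only match 8 characters" comment in the quoted patch. The helpers match_exact() and match_prefix() and the device IDs are hypothetical, used only for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed value: the quoted comment says only 8 characters can be matched. */
#define ACPI_ID_LEN 9

/* Hypothetical helper: exact comparison, as in the code before the change. */
bool match_exact(const char *driver_id, const char *hw_id)
{
	return strcmp(driver_id, hw_id) == 0;
}

/* Hypothetical helper: length-limited comparison, as in the proposed change. */
bool match_prefix(const char *driver_id, const char *hw_id)
{
	return strncmp(driver_id, hw_id, ACPI_ID_LEN - 1) == 0;
}
```

Under these assumptions, the behavior changes only for driver IDs that are exactly 8 characters long: they start matching any longer hardware ID that shares the same first 8 characters. Shorter driver IDs still need an exact match, because the terminating NUL takes part in the comparison.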
>> +2) ASYNC simplified example::
>> +
>> + void handle_io_on_vmgenfd(int vmgenfd)
>> + {
>> + unsigned genid;
>> +
>> + // because of VM generation change, we need to rebuild world
>> + reseed_app_env();
>> +
>> + // read new gen ID - we need it to confirm we've handled update
>> + read(fd, &genid, sizeof(genid));
>
> This is racy in case two consecutive snapshots happen. The read needs
> to go before the reseed.
>
Switched them around as you suggest, to avoid confusion.
But I don't see a problem with this race. The idea here is to trigger
reseed_app_env(), which doesn't depend on the generation counter value.
Whether the counter gets incremented once or N times is irrelevant; all
we care about is that we pause execution and reseed before resuming.
How many generation changes happen in between makes no difference.
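
As an illustration of the reordered handler, here is a userspace sketch that models the device in memory instead of using a real file descriptor. It relies on one behavior visible in the quoted vmgenid_write(): acknowledging a generation that is no longer current is rejected. The struct fake_vmgenid and all fake_* helpers are hypothetical stand-ins, and the pending_bumps field is only a test hook that simulates a snapshot landing while the reseed runs.

```c
#include <stdbool.h>

/* Hypothetical in-memory stand-in for the vmgenid device. */
struct fake_vmgenid {
	unsigned generation;	/* current generation counter */
	unsigned acked;		/* last generation this watcher acknowledged */
	unsigned pending_bumps;	/* snapshots that will land during a reseed */
	unsigned reseeds;	/* how many times the app rebuilt its state */
};

/* Models read(): report the current generation. */
unsigned fake_read(const struct fake_vmgenid *dev)
{
	return dev->generation;
}

/* Models write(): accept the ack only if it names the current generation. */
bool fake_ack(struct fake_vmgenid *dev, unsigned genid)
{
	if (genid != dev->generation)
		return false;
	dev->acked = genid;
	return true;
}

/* Models reseed_app_env(); a snapshot may land while it runs. */
void fake_reseed(struct fake_vmgenid *dev)
{
	dev->reseeds++;
	if (dev->pending_bumps) {
		dev->pending_bumps--;
		dev->generation++;
	}
}

/* Read first, then reseed, then ack; retry if the ack is rejected. */
void handle_generation_change(struct fake_vmgenid *dev)
{
	unsigned genid;

	do {
		genid = fake_read(dev);
		fake_reseed(dev);
	} while (!fake_ack(dev, genid));
}
```

With the read placed before the reseed, a snapshot that lands during the reseed leaves the handler holding a stale value, so the ack fails and the reseed is repeated. If the read came after the reseed, the ack would succeed even though the reseeded state might predate the latest snapshot.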
>> +3) Mapped memory polling simplified example::
>> +
>> + /*
>> + * app/library function that provides cached secrets
>> + */
>> + char * safe_cached_secret(app_data_t *app)
>> + {
>> + char *secret;
>> + volatile unsigned *const genid_ptr = get_vmgenid_mapping(app);
>> + again:
>> + secret = __cached_secret(app);
>> +
>> + if (unlikely(*genid_ptr != app->cached_genid)) {
>> + // rebuild world then confirm the genid update (thru write)
>> + rebuild_caches(app);
>> +
>> + app->cached_genid = *genid_ptr;
>
> This is racy again. You need to read the genid before rebuild and set
> it here.
>
I don't see the race. The gen counter is read from volatile mapped
memory; on a detected change we rebuild the world, confirm the update
back to the driver, then restart the loop. The loop exits once no more
changes happen.
>> + ack_vmgenid_update(app);
>> +
>> + goto again;
>> + }
>> +
>> + return secret;
>> + }
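
For the mapped-memory polling example above, the ordering suggested in the review (read the counter before rebuilding, then store that same value) could look like the following userspace sketch. Everything here is a hypothetical model: struct app_data, its fields, and safe_cached_secret_gen() are illustrative names, the counter is a plain variable rather than a real mapping, and bumps_during_rebuild is a test hook that simulates a snapshot landing in the middle of a rebuild. The "secret" is reduced to the generation it was built under, so staleness is easy to observe.

```c
/* Hypothetical application state; names are illustrative only. */
struct app_data {
	volatile unsigned *genid_ptr;	/* stand-in for the mapped counter */
	unsigned cached_genid;		/* generation the caches are valid for */
	unsigned secret_gen;		/* generation the cached secret was built under */
	unsigned rebuilds;		/* number of cache rebuilds so far */
	unsigned bumps_during_rebuild;	/* test hook: snapshots landing mid-rebuild */
};

/* Models rebuild_caches(): derive the secret, then possibly get snapshotted. */
void rebuild_caches(struct app_data *app)
{
	app->secret_gen = *app->genid_ptr;
	app->rebuilds++;
	if (app->bumps_during_rebuild) {
		app->bumps_during_rebuild--;
		(*app->genid_ptr)++;
	}
}

/* Returns the generation of the secret handed out to the caller. */
unsigned safe_cached_secret_gen(struct app_data *app)
{
	for (;;) {
		unsigned secret_gen = app->secret_gen;
		unsigned genid = *app->genid_ptr;	/* read before rebuilding */

		if (genid == app->cached_genid)
			return secret_gen;

		rebuild_caches(app);
		/* Store the value read before the rebuild, not a fresh read. */
		app->cached_genid = genid;
	}
}
```

In this model, a snapshot during the rebuild leaves cached_genid one generation behind, so the next loop iteration rebuilds again. Storing a fresh read of the counter after the rebuild would instead make the check pass while the secret still belongs to the older generation.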
>> +
>> +
>> +static int vmgenid_close(struct inode *inode, struct file *file)
>> +{
>> + struct file_data *file_data = file->private_data;
>> + struct dev_data *priv = file_data->dev_data;
>> +
>> + if (file_data->acked_gen_counter != priv->generation_counter)
>> + vmgenid_put_outdated_watchers(priv);
>
> Is this racy? Could there be a snapshot notification coming between
> the branch and the put?
>
This is indeed racy, will fix it in patch v3.
>> + atomic_dec(&priv->watchers);
>> + kfree(file_data);
>> +
>> + return 0;
>> +}
>> +static ssize_t vmgenid_write(struct file *file, const char __user *ubuf,
>> + size_t count, loff_t *ppos)
>> +{
>> + struct file_data *file_data = file->private_data;
>> + struct dev_data *priv = file_data->dev_data;
>> + unsigned int acked_gen_count;
>> +
>> + /* disallow partial writes */
>> + if (count != sizeof(acked_gen_count))
>> + return -EINVAL;
>> + if (copy_from_user(&acked_gen_count, ubuf, count))
>> + return -EFAULT;
>> + /* wrong gen-counter acknowledged */
>> + if (acked_gen_count != priv->generation_counter)
>> + return -EINVAL;
>> +
>> + if (file_data->acked_gen_counter != priv->generation_counter) {
>> + /* update local view of UUID */
>> + file_data->acked_gen_counter = acked_gen_count;
>> + vmgenid_put_outdated_watchers(priv);
>
> Same question here: What if there is a notification between the branch
> and the put?
>
Right, racy here as well. Will fix in patch v3.
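
One possible shape for such a fix, sketched in userspace and not taken from any posted patch, is to make the check and the put a single critical section that the notification path also enters. The struct and field names follow the quoted code, but the lock, model_notify(), model_ack(), and the assumption that a notification marks every watcher as outdated are all hypothetical.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical userspace model; field names follow the quoted patch. */
struct dev_data {
	pthread_mutex_t lock;		/* assumed: serializes notify and ack */
	unsigned generation_counter;
	int watchers;
	int outdated_watchers;
};

struct file_data {
	struct dev_data *dev_data;
	unsigned acked_gen_counter;
};

/* Models the snapshot notification (assumed to outdate every watcher). */
void model_notify(struct dev_data *priv)
{
	pthread_mutex_lock(&priv->lock);
	priv->generation_counter++;
	priv->outdated_watchers = priv->watchers;
	pthread_mutex_unlock(&priv->lock);
}

/* Models the ack path: the compare and the put happen under one lock. */
int model_ack(struct file_data *file_data, unsigned acked_gen_count)
{
	struct dev_data *priv = file_data->dev_data;
	int ret = -EINVAL;	/* wrong gen-counter acknowledged */

	pthread_mutex_lock(&priv->lock);
	if (acked_gen_count == priv->generation_counter) {
		if (file_data->acked_gen_counter != priv->generation_counter) {
			file_data->acked_gen_counter = acked_gen_count;
			priv->outdated_watchers--;
		}
		ret = 0;
	}
	pthread_mutex_unlock(&priv->lock);
	return ret;
}
```

Because the notification takes the same lock, it can no longer slip in between the comparison and the decrement. The same pattern would apply to the check in vmgenid_close().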
Thanks,
Adrian.
Amazon Development Center (Romania) S.R.L. registered office: 27A Sf. Lazar Street, UBC5, floor 2, Iasi, Iasi County, 700045, Romania. Registered in Romania. Registration number J22/2621/2005.