Message-ID: <90813c19-5fef-4c29-9387-6c9e2770a549@linux.intel.com>
Date: Mon, 25 Mar 2024 13:01:32 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: baolu.lu@...ux.intel.com, Kevin Tian <kevin.tian@...el.com>,
Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
Robin Murphy <robin.murphy@....com>,
Jean-Philippe Brucker <jean-philippe@...aro.org>,
Nicolin Chen <nicolinc@...dia.com>, Yi Liu <yi.l.liu@...el.com>,
Jacob Pan <jacob.jun.pan@...ux.intel.com>,
Joel Granados <j.granados@...sung.com>, iommu@...ts.linux.dev,
virtualization@...ts.linux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 4/8] iommufd: Add iommufd fault object
On 2024/3/23 1:09, Jason Gunthorpe wrote:
> On Fri, Mar 15, 2024 at 09:46:06AM +0800, Baolu Lu wrote:
>> On 3/9/24 2:03 AM, Jason Gunthorpe wrote:
>>> On Mon, Jan 22, 2024 at 03:38:59PM +0800, Lu Baolu wrote:
>>>> --- /dev/null
>>>> +++ b/drivers/iommu/iommufd/fault.c
>>>> @@ -0,0 +1,255 @@
>>>> +// SPDX-License-Identifier: GPL-2.0-only
>>>> +/* Copyright (C) 2024 Intel Corporation
>>>> + */
>>>> +#define pr_fmt(fmt) "iommufd: " fmt
>>>> +
>>>> +#include <linux/file.h>
>>>> +#include <linux/fs.h>
>>>> +#include <linux/module.h>
>>>> +#include <linux/mutex.h>
>>>> +#include <linux/iommufd.h>
>>>> +#include <linux/poll.h>
>>>> +#include <linux/anon_inodes.h>
>>>> +#include <uapi/linux/iommufd.h>
>>>> +
>>>> +#include "iommufd_private.h"
>>>> +
>>>> +static int device_add_fault(struct iopf_group *group)
>>>> +{
>>>> + struct iommufd_device *idev = group->cookie->private;
>>>> + void *curr;
>>>> +
>>>> + curr = xa_cmpxchg(&idev->faults, group->last_fault.fault.prm.grpid,
>>>> + NULL, group, GFP_KERNEL);
>>>> +
>>>> + return curr ? xa_err(curr) : 0;
>>>> +}
>>>> +
>>>> +static void device_remove_fault(struct iopf_group *group)
>>>> +{
>>>> + struct iommufd_device *idev = group->cookie->private;
>>>> +
>>>> + xa_store(&idev->faults, group->last_fault.fault.prm.grpid,
>>>> + NULL, GFP_KERNEL);
>>>
>>> xa_erase ?
>>
>> Yes. Sure.
>>
>>> Is grpid OK to use this way? Doesn't it come from the originating
>>> device?
>>
>> The group ID is generated by the hardware. Here, we use it as an index
>> in the fault array to ensure it can be quickly retrieved in the page
>> fault response path.
>
> I'm nervous about this, we are trusting HW outside the kernel to
> provide unique grp id's which are integral to how the kernel
> operates..
Agreed.
>
>>>> +static ssize_t iommufd_fault_fops_read(struct file *filep, char __user *buf,
>>>> + size_t count, loff_t *ppos)
>>>> +{
>>>> + size_t fault_size = sizeof(struct iommu_hwpt_pgfault);
>>>> + struct iommufd_fault *fault = filep->private_data;
>>>> + struct iommu_hwpt_pgfault data;
>>>> + struct iommufd_device *idev;
>>>> + struct iopf_group *group;
>>>> + struct iopf_fault *iopf;
>>>> + size_t done = 0;
>>>> + int rc;
>>>> +
>>>> + if (*ppos || count % fault_size)
>>>> + return -ESPIPE;
>>>> +
>>>> + mutex_lock(&fault->mutex);
>>>> + while (!list_empty(&fault->deliver) && count > done) {
>>>> + group = list_first_entry(&fault->deliver,
>>>> + struct iopf_group, node);
>>>> +
>>>> + if (list_count_nodes(&group->faults) * fault_size > count - done)
>>>> + break;
>>>> +
>>>> + idev = (struct iommufd_device *)group->cookie->private;
>>>> + list_for_each_entry(iopf, &group->faults, list) {
>>>> + iommufd_compose_fault_message(&iopf->fault, &data, idev);
>>>> + rc = copy_to_user(buf + done, &data, fault_size);
>>>> + if (rc)
>>>> + goto err_unlock;
>>>> + done += fault_size;
>>>> + }
>>>> +
>>>> + rc = device_add_fault(group);
>>>
>>> See I wonder if this should be some xa_alloc or something instead of
>>> trying to use the grpid?
>>
>> So this magic number will be passed to user space in the fault message.
>> And the user will then include this number in its response message. The
>> response message is valid only when the magic number matches. Do I
>> understand you correctly?
>
> Yes, then it is simple xa_alloc() and xa_load() without any other
> searching and we don't have to rely on the grpid to be correctly
> formed by the PCI device.
>
> But I don't know about performance. xa_alloc() is pretty fast, but
> trusting the grpid would be faster.
>
> IMHO from a uAPI perspective we should have a definite "cookie" that
> gets echoed back. If the kernel uses xa_alloc or grpid to build that
> cookie it doesn't matter to the uAPI.
Okay, I will head in this direction.
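
For illustration only, here is a minimal user-space sketch of the cookie
idea discussed above. It is not the iommufd code and does not use the
kernel xarray: a fixed-size table stands in for xa_alloc() and
xa_erase(), and every name in it (fault_group, cookie_table,
cookie_alloc, cookie_take, COOKIE_SLOTS) is hypothetical. The point it
tries to show is that the kernel, not the device, picks the index, so
two fault groups carrying the same hardware grpid still get distinct
cookies, and a stale or bogus cookie in a response finds nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COOKIE_SLOTS 64u	/* hypothetical bound on outstanding groups */

/* Stand-in for a pending fault group; hw_grpid is device-supplied. */
struct fault_group {
	uint32_t hw_grpid;
};

/* Stand-in for the per-device xarray of outstanding fault groups. */
struct cookie_table {
	struct fault_group *slot[COOKIE_SLOTS];
};

/*
 * Roughly models xa_alloc(): store the group at the lowest free index
 * and return that index as the cookie, or -1 if the table is full.
 * The hardware grpid plays no part in choosing the index.
 */
static int cookie_alloc(struct cookie_table *t, struct fault_group *g)
{
	assert(g != NULL);

	for (unsigned int i = 0; i < COOKIE_SLOTS; i++) {
		if (!t->slot[i]) {
			t->slot[i] = g;
			return (int)i;
		}
	}
	return -1;
}

/*
 * Roughly models xa_erase() on the response path: return the group
 * stored under the cookie and clear the slot. An out-of-range or
 * already-consumed cookie yields NULL.
 */
static struct fault_group *cookie_take(struct cookie_table *t,
				       uint32_t cookie)
{
	struct fault_group *g;

	if (cookie >= COOKIE_SLOTS)
		return NULL;

	g = t->slot[cookie];
	t->slot[cookie] = NULL;
	return g;
}
```

In this model the read path would call cookie_alloc() before copying
the fault message to user space and place the returned value in the
message; the response path would call cookie_take() with the value
echoed back and reject the response when it returns NULL.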
Best regards,
baolu