linux-kernel - Re: [PATCH v3 4/8] iommufd: Add iommufd fault object

Message-ID: <20240322170939.GJ66976@ziepe.ca>
Date: Fri, 22 Mar 2024 14:09:39 -0300
From: Jason Gunthorpe <jgg@...pe.ca>
To: Baolu Lu <baolu.lu@...ux.intel.com>
Cc: Kevin Tian <kevin.tian@...el.com>, Joerg Roedel <joro@...tes.org>,
	Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>,
	Jean-Philippe Brucker <jean-philippe@...aro.org>,
	Nicolin Chen <nicolinc@...dia.com>, Yi Liu <yi.l.liu@...el.com>,
	Jacob Pan <jacob.jun.pan@...ux.intel.com>,
	Joel Granados <j.granados@...sung.com>, iommu@...ts.linux.dev,
	virtualization@...ts.linux-foundation.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 4/8] iommufd: Add iommufd fault object

On Fri, Mar 15, 2024 at 09:46:06AM +0800, Baolu Lu wrote:
> On 3/9/24 2:03 AM, Jason Gunthorpe wrote:
> > On Mon, Jan 22, 2024 at 03:38:59PM +0800, Lu Baolu wrote:
> > > --- /dev/null
> > > +++ b/drivers/iommu/iommufd/fault.c
> > > @@ -0,0 +1,255 @@
> > > +// SPDX-License-Identifier: GPL-2.0-only
> > > +/* Copyright (C) 2024 Intel Corporation
> > > + */
> > > +#define pr_fmt(fmt) "iommufd: " fmt
> > > +
> > > +#include <linux/file.h>
> > > +#include <linux/fs.h>
> > > +#include <linux/module.h>
> > > +#include <linux/mutex.h>
> > > +#include <linux/iommufd.h>
> > > +#include <linux/poll.h>
> > > +#include <linux/anon_inodes.h>
> > > +#include <uapi/linux/iommufd.h>
> > > +
> > > +#include "iommufd_private.h"
> > > +
> > > +static int device_add_fault(struct iopf_group *group)
> > > +{
> > > +	struct iommufd_device *idev = group->cookie->private;
> > > +	void *curr;
> > > +
> > > +	curr = xa_cmpxchg(&idev->faults, group->last_fault.fault.prm.grpid,
> > > +			  NULL, group, GFP_KERNEL);
> > > +
> > > +	return curr ? xa_err(curr) : 0;
> > > +}
> > > +
> > > +static void device_remove_fault(struct iopf_group *group)
> > > +{
> > > +	struct iommufd_device *idev = group->cookie->private;
> > > +
> > > +	xa_store(&idev->faults, group->last_fault.fault.prm.grpid,
> > > +		 NULL, GFP_KERNEL);
> > 
> > xa_erase ?
> 
> Yes. Sure.
> 
> > Is grpid OK to use this way? Doesn't it come from the originating
> > device?
> 
> The group ID is generated by the hardware. Here, we use it as an index
> in the fault array to ensure it can be quickly retrieved in the page
> fault response path.

I'm nervous about this: we are trusting HW outside the kernel to
provide unique group IDs, which are integral to how the kernel
operates.

> > > +static ssize_t iommufd_fault_fops_read(struct file *filep, char __user *buf,
> > > +				       size_t count, loff_t *ppos)
> > > +{
> > > +	size_t fault_size = sizeof(struct iommu_hwpt_pgfault);
> > > +	struct iommufd_fault *fault = filep->private_data;
> > > +	struct iommu_hwpt_pgfault data;
> > > +	struct iommufd_device *idev;
> > > +	struct iopf_group *group;
> > > +	struct iopf_fault *iopf;
> > > +	size_t done = 0;
> > > +	int rc;
> > > +
> > > +	if (*ppos || count % fault_size)
> > > +		return -ESPIPE;
> > > +
> > > +	mutex_lock(&fault->mutex);
> > > +	while (!list_empty(&fault->deliver) && count > done) {
> > > +		group = list_first_entry(&fault->deliver,
> > > +					 struct iopf_group, node);
> > > +
> > > +		if (list_count_nodes(&group->faults) * fault_size > count - done)
> > > +			break;
> > > +
> > > +		idev = (struct iommufd_device *)group->cookie->private;
> > > +		list_for_each_entry(iopf, &group->faults, list) {
> > > +			iommufd_compose_fault_message(&iopf->fault, &data, idev);
> > > +			rc = copy_to_user(buf + done, &data, fault_size);
> > > +			if (rc)
> > > +				goto err_unlock;
> > > +			done += fault_size;
> > > +		}
> > > +
> > > +		rc = device_add_fault(group);
> > 
> > See I wonder if this should be some xa_alloc or something instead of
> > trying to use the grpid?
> 
> So this magic number will be passed to user space in the fault message.
> And the user will then include this number in its response message. The
> response message is valid only when the magic number matches. Do I get
> you correctly?

Yes, then it is simple xa_alloc() and xa_load() without any other
searching and we don't have to rely on the grpid to be correctly
formed by the PCI device.
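
A rough userspace model of that idea, as a sketch only: it is not kernel
code, and the names cookie_table, cookie_alloc(), cookie_load(),
cookie_erase() and MAX_PENDING are hypothetical. In the kernel the three
helpers would map onto xa_alloc(), xa_load() and xa_erase(). The point it
shows is that the key is chosen by the kernel, so two groups carrying the
same device-provided grpid still get distinct cookies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 64		/* hypothetical cap on outstanding groups */

struct fault_group {		/* stand-in for struct iopf_group */
	uint32_t grpid;		/* device-provided, never used as a key */
};

struct cookie_table {		/* stand-in for the per-device xarray */
	struct fault_group *slot[MAX_PENDING];
};

/* Like xa_alloc(): store g at any free index and report it as the cookie. */
static int cookie_alloc(struct cookie_table *t, struct fault_group *g,
			uint32_t *cookie)
{
	assert(t && g && cookie);
	for (uint32_t i = 0; i < MAX_PENDING; i++) {
		if (!t->slot[i]) {
			t->slot[i] = g;
			*cookie = i;
			return 0;
		}
	}
	return -1;	/* no free index; xa_alloc() would return -EBUSY */
}

/* Like xa_load(): direct lookup, NULL for an unknown or stale cookie. */
static struct fault_group *cookie_load(struct cookie_table *t, uint32_t cookie)
{
	return cookie < MAX_PENDING ? t->slot[cookie] : NULL;
}

/* Like xa_erase(): remove the entry and return what was stored there. */
static struct fault_group *cookie_erase(struct cookie_table *t, uint32_t cookie)
{
	struct fault_group *g = cookie_load(t, cookie);

	if (g)
		t->slot[cookie] = NULL;
	return g;
}
```

In this model the read path would call cookie_alloc() and put the cookie in
the fault message, and the response path would call cookie_erase() with the
echoed value and reject the response if it returns NULL.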

But I don't know about performance: xa_alloc() is pretty fast, but
trusting the grpid would be faster.

IMHO from a uAPI perspective we should have a definite "cookie" that
gets echoed back. Whether the kernel uses xa_alloc() or the grpid to
build that cookie doesn't matter to the uAPI.

Jason