lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <05381ccb-bdbb-4342-8a2f-11ee72454eda@linux.intel.com>
Date: Wed, 7 Aug 2024 11:09:41 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: Niklas Schnelle <schnelle@...ux.ibm.com>, Joerg Roedel <joro@...tes.org>,
 Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>,
 Matthew Rosato <mjrosato@...ux.ibm.com>,
 Gerald Schaefer <gerald.schaefer@...ux.ibm.com>
Cc: baolu.lu@...ux.intel.com, Gerd Bayer <gbayer@...ux.ibm.com>,
 Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
 Alexander Gordeev <agordeev@...ux.ibm.com>,
 Sven Schnelle <svens@...ux.ibm.com>, Joerg Roedel <jroedel@...e.de>,
 iommu@...ts.linux.dev, linux-kernel@...r.kernel.org,
 linux-s390@...r.kernel.org
Subject: Re: [PATCH] iommu/s390: Implement blocking domain

On 2024/8/6 21:45, Niklas Schnelle wrote:
> This fixes a crash when surprise hot-unplugging a PCI device. This crash
> happens because during hot-unplug __iommu_group_set_domain_nofail()
> attaching the default domain fails when the platform no longer
> recognizes the device as it has already been removed and we end up with
> a NULL domain pointer and UAF. This is exactly the case referred to in
> the second comment in __iommu_device_set_domain() and just as stated
> there if we can instead attach the blocking domain the UAF is prevented
> as this can handle the already removed device. Implement the blocking
> domain to use this handling. This would still leave us with a warning
> for a failed attach. As failing to attach when the device is no longer
> present is expected behavior turn this into an explicit -ENODEV error
> and don't warn for it. Also change the error return for a NULL zdev to
> -EIO as we don't want to ignore this case that would be a serious bug.
> 
> Fixes: c76c067e488c ("s390/pci: Use dma-iommu layer")
> Signed-off-by: Niklas Schnelle<schnelle@...ux.ibm.com>
> ---
> Note: I somewhat suspect this to be related to the following discussion
> or at least we have seen the same backtraces in reports that we suspect
> to be caused by the issue fixed with this patch. In the case I was able
> to reproduce with vfio-pci pass-through to a KVM guest I got a different
> trace though.
> 
> Organizational note: I'll be on vacation starting Thursday. Matt will
> then take over and sent new revisions as necessary.
> ---
>   drivers/iommu/iommu.c      |  7 ++++--
>   drivers/iommu/s390-iommu.c | 55 ++++++++++++++++++++++++++++++++++++++--------
>   2 files changed, 51 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index ed6c5cb60c5a..91b3b23bf55c 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -119,8 +119,11 @@ static int __iommu_group_set_domain(struct iommu_group *group,
>   static void __iommu_group_set_domain_nofail(struct iommu_group *group,
>   					    struct iommu_domain *new_domain)
>   {
> -	WARN_ON(__iommu_group_set_domain_internal(
> -		group, new_domain, IOMMU_SET_DOMAIN_MUST_SUCCEED));
> +	int ret = __iommu_group_set_domain_internal(
> +		group, new_domain, IOMMU_SET_DOMAIN_MUST_SUCCEED);
> +
> +	/* Allow attach to fail when the device is gone */
> +	WARN_ON(ret && ret != -ENODEV);

The return values of attach_dev have been documented in include/linux
/iommu.h:

/**
  * struct iommu_domain_ops - domain specific operations
  * @attach_dev: attach an iommu domain to a device
  *  Return:
  * * 0          - success
  * * EINVAL     - can indicate that device and domain are incompatible 
due to
  *                some previous configuration of the domain, in which 
case the
  *                driver shouldn't log an error, since it is legitimate 
for a
  *                caller to test reuse of existing domains. Otherwise, 
it may
  *                still represent some other fundamental problem
  * * ENOMEM     - out of memory
  * * ENOSPC     - non-ENOMEM type of resource allocation failures
  * * EBUSY      - device is attached to a domain and cannot be changed
  * * ENODEV     - device specific errors, not able to be attached
  * * <others>   - treated as ENODEV by the caller. Use is discouraged

ENODEV is obviously not suitable for the case of 'device is gone'.

Thanks,
baolu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ