linux-kernel - Re: [PATCH 4.4 095/101] iommu: Handle default domain attach failure

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1499280779.1935.96.camel@codethink.co.uk>
Date:   Wed, 05 Jul 2017 19:52:59 +0100
From:   Ben Hutchings <ben.hutchings@...ethink.co.uk>
To:     Robin Murphy <robin.murphy@....com>, Joerg Roedel <jroedel@...e.de>
Cc:     linux-kernel@...r.kernel.org, stable@...r.kernel.org,
        Punit Agrawal <punit.agrawal@....com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH 4.4 095/101] iommu: Handle default domain attach failure

On Mon, 2017-07-03 at 15:35 +0200, Greg Kroah-Hartman wrote:
> 4.4-stable review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Robin Murphy <robin.murphy@....com>
> 
> commit 797a8b4d768c58caac58ee3e8cb36a164d1b7751 upstream.
> 
> We wouldn't normally expect ops->attach_dev() to fail, but on IOMMUs
> with limited hardware resources, or generally misconfigured systems,
> it is certainly possible. We report failure correctly from the external
> iommu_attach_device() interface, but do not do so in iommu_group_add()
> when attaching to the default domain. The result of failure there is
> that the device, group and domain all get left in a broken,
> part-configured state which leads to weird errors and misbehaviour down
> the line when IOMMU API calls sort-of-but-don't-quite work.
> 
> Check the return value of __iommu_attach_device() on the default domain,
> and refactor the error handling paths to cope with its failure and clean
> up correctly in such cases.
[...]
> @@ -432,8 +426,10 @@ rename:
>  	mutex_lock(&group->mutex);
>  	list_add_tail(&device->list, &group->devices);
>  	if (group->domain)
> -		__iommu_attach_device(group->domain, dev);
> +		ret = __iommu_attach_device(group->domain, dev);
>  	mutex_unlock(&group->mutex);
> +	if (ret)
> +		goto err_put_group;

It's still (briefly) possible for other tasks to observe the device in
the broken state.  Shouldn't the error check be done before
mutex_unlock()?
 
>  	/* Notify any listeners about change to group. */
>  	blocking_notifier_call_chain(&group->notifier,
> @@ -444,6 +440,21 @@ rename:
>  	pr_info("Adding device %s to group %d\n", dev_name(dev), group->id);
>  
>  	return 0;
> +
> +err_put_group:
> +	mutex_lock(&group->mutex);
> +	list_del(&device->list);
> +	mutex_unlock(&group->mutex);
> +	dev->iommu_group = NULL;
> +	kobject_put(group->devices_kobj);
> +err_free_name:
> +	kfree(device->name);
> +err_remove_link:
> +	sysfs_remove_link(&dev->kobj, "iommu_group");
> +err_free_device:
> +	kfree(device);
> +	pr_err("Failed to add device %s to group %d: %d\n", dev_name(dev), group->id, ret);
> +	return ret;
>  }
>  EXPORT_SYMBOL_GPL(iommu_group_add_device);
>  

It seems like this cleanup statement in iommu_group_remove_device()
should also be done here under err_put_group:

	sysfs_remove_link(group->devices_kobj, device->name);

Ben.

-- 
Ben Hutchings
Software Developer, Codethink Ltd.