Date:   Mon, 11 Oct 2021 15:45:55 +0200
From:   Pierre Morel <pmorel@...ux.ibm.com>
To:     Halil Pasic <pasic@...ux.ibm.com>,
        Vineeth Vijayan <vneethv@...ux.ibm.com>,
        Peter Oberparleiter <oberpar@...ux.ibm.com>,
        Heiko Carstens <hca@...ux.ibm.com>,
        Vasily Gorbik <gor@...ux.ibm.com>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        Michael Mueller <mimu@...ux.ibm.com>,
        Cornelia Huck <cohuck@...hat.com>, linux-s390@...r.kernel.org,
        linux-kernel@...r.kernel.org
Cc:     stable@...r.kernel.org, bfu@...hat.com
Subject: Re: [RFC PATCH 1/1] s390/cio: make ccw_device_dma_* more robust



On 10/11/21 1:59 PM, Halil Pasic wrote:
> Since commit 48720ba56891 ("virtio/s390: use DMA memory for ccw I/O and
> classic notifiers") we were supposed to make sure that
> virtio_ccw_release_dev() completes before the ccw device and the
> attached dma pool are torn down, but unfortunately we did not.  Before
> that commit it used to be OK to delay cleaning up the memory allocated
> by virtio-ccw indefinitely (which isn't really intuitive for anyone used
> to destruction happening in reverse construction order), but now we
> trigger a BUG_ON if the genpool is destroyed before all memory allocated
> from it is freed, which brings down the guest. We can observe this problem when
> unregister_virtio_device() does not give up the last reference to the
> virtio_device (e.g. because a virtio-scsi attached scsi disk got removed
> without first unmounting its mounted partition).
> 
> To make sure that the genpool is only destroyed after all the necessary
> freeing is done let us take a reference on the ccw device on each
> ccw_device_dma_zalloc() and give it up on each ccw_device_dma_free().
> 
> Actually there are multiple approaches to fixing the problem at hand
> that can work. The upside of this one is that it is the safest one while
> remaining simple. We don't crash the guest even if the driver does not
> pair allocations and frees. The downside is the reference counting
> overhead, that the reference counting for ccw devices becomes more
> complex, in a sense that we need to pair the calls to the aforementioned
> functions for it to be correct, and that if we happen to leak, we leak
> more than necessary (the whole ccw device instead of just the genpool).
> 
> Some alternatives to this approach are taking a reference in
> virtio_ccw_online() and giving it up in virtio_ccw_release_dev() or
> making sure virtio_ccw_release_dev() completes its work before
> virtio_ccw_remove() returns. The downside of these approaches is that
> these are less safe against programming errors.
> 
> Cc: <stable@...r.kernel.org> # v5.3
> Signed-off-by: Halil Pasic <pasic@...ux.ibm.com>
> Fixes: 48720ba56891 ("virtio/s390: use DMA memory for ccw I/O and
> classic notifiers")
> Reported-by: bfu@...hat.com
> 
> ---
> 
> FYI I've proposed a different fix to this very same problem:
> https://lore.kernel.org/lkml/20210915215742.1793314-1-pasic@linux.ibm.com/
> 
> This patch is more or less a result of that discussion.
> ---
>   drivers/s390/cio/device_ops.c | 12 +++++++++++-
>   1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/s390/cio/device_ops.c b/drivers/s390/cio/device_ops.c
> index 0fe7b2f2e7f5..c533d1dadc6b 100644
> --- a/drivers/s390/cio/device_ops.c
> +++ b/drivers/s390/cio/device_ops.c
> @@ -825,13 +825,23 @@ EXPORT_SYMBOL_GPL(ccw_device_get_chid);
>    */
>   void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size)
>   {
> -	return cio_gp_dma_zalloc(cdev->private->dma_pool, &cdev->dev, size);
> +	void *addr;
> +
> +	if (!get_device(&cdev->dev))
> +		return NULL;
> +	addr = cio_gp_dma_zalloc(cdev->private->dma_pool, &cdev->dev, size);
> +	if (IS_ERR_OR_NULL(addr))

I may be wrong, but it seems that only dma_alloc_coherent(), as used in
cio_gp_dma_zalloc(), can report an error, and that error is ignored and
the result is used as a valid pointer.

So shouldn't we modify that function and just test for a NULL address here?

Here is what I mean:---------------------------------

diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
index 2bc55ccf3f23..b45fbaa7131b 100644
--- a/drivers/s390/cio/css.c
+++ b/drivers/s390/cio/css.c
@@ -1176,7 +1176,7 @@ void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct device *dma_dev,
                 chunk_size = round_up(size, PAGE_SIZE);
                 addr = (unsigned long) dma_alloc_coherent(dma_dev,
                                          chunk_size, &dma_addr, CIO_DMA_GFP);
-               if (!addr)
+               if (IS_ERR_OR_NULL(addr))
                         return NULL;
                 gen_pool_add_virt(gp_dma, addr, dma_addr, chunk_size, -1);
                 addr = gen_pool_alloc(gp_dma, size);

---------------------------------
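
To make the concern concrete, here is a minimal userspace sketch of why a
plain NULL test does not catch an error-encoded pointer. The helpers are
simplified stand-ins modelled on the kernel's <linux/err.h>, and
passes_null_check() is a hypothetical name used only for illustration.
Whether dma_alloc_coherent() can actually return such a value on this path
is an assumption that should be verified; the DMA API documentation
describes it as returning NULL on failure.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer lies in the top MAX_ERRNO addresses. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Hypothetical helper: what a plain "if (!addr)" test concludes. */
static inline bool passes_null_check(const void *ptr)
{
	return ptr != NULL;
}
```

An error pointer such as ERR_PTR(-12) passes the plain NULL test and is
only rejected by IS_ERR_OR_NULL().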

> +		put_device(&cdev->dev);

addr is not NULL when it holds an error value (ERR_PTR).

> +	return addr;

Maybe: return IS_ERR_OR_NULL(addr) ? NULL : addr;
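
A rough userspace model of that suggestion, assuming the allocator could
hand back NULL, an error pointer, or real memory. All names here
(fake_dev, fake_pool_zalloc, model_dma_zalloc, fail_mode) are hypothetical
and only mimic the shape of ccw_device_dma_zalloc(); the point is that the
reference is dropped on every failure and callers only ever see NULL or a
usable pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device: just a reference count. */
struct fake_dev {
	int refcount;
};

static void fake_get(struct fake_dev *dev) { dev->refcount++; }
static void fake_put(struct fake_dev *dev) { dev->refcount--; }

/* NULL and the top 4095 addresses count as failure, like IS_ERR_OR_NULL(). */
static int is_err_or_null(const void *p)
{
	return !p || (uintptr_t)p >= (uintptr_t)-4095;
}

/* Hypothetical allocator: fail_mode 1 gives NULL, 2 gives an error pointer. */
static void *fake_pool_zalloc(size_t size, int fail_mode)
{
	if (fail_mode == 1)
		return NULL;
	if (fail_mode == 2)
		return (void *)(intptr_t)-12; /* models -ENOMEM */
	return calloc(1, size);
}

/* Take a reference, allocate, and undo the reference on any failure. */
static void *model_dma_zalloc(struct fake_dev *dev, size_t size, int fail_mode)
{
	void *addr;

	fake_get(dev);
	addr = fake_pool_zalloc(size, fail_mode);
	if (is_err_or_null(addr)) {
		fake_put(dev);
		return NULL; /* normalize both failure kinds to NULL */
	}
	return addr;
}
```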

>   }
>   EXPORT_SYMBOL(ccw_device_dma_zalloc);
>   
>   void ccw_device_dma_free(struct ccw_device *cdev, void *cpu_addr, size_t size)
>   {
> +	if (!cpu_addr)
> +		return;

No need: cpu_addr is already tested in cio_gp_dma_free().
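
One caveat that may be worth checking, sketched under assumptions: in the
patch as posted, the early return also skips the put_device() call, so
removing it would change the reference pairing for a NULL cpu_addr. The
model below uses hypothetical names (fake_dev, fake_pool_free,
model_dma_free) and only imitates the structure of ccw_device_dma_free().

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device: just a reference count. */
struct fake_dev {
	int refcount;
};

/* Like cio_gp_dma_free() in this model: tolerates a NULL address. */
static void fake_pool_free(void *cpu_addr)
{
	if (!cpu_addr)
		return;
	free(cpu_addr);
}

/* Drop the reference only when there was an allocation to free. */
static void model_dma_free(struct fake_dev *dev, void *cpu_addr)
{
	if (!cpu_addr)
		return; /* a failed allocation took no lasting reference */
	fake_pool_free(cpu_addr);
	dev->refcount--;
}
```

Without the early return in model_dma_free(), a NULL cpu_addr would still
decrement the count, even though the matching allocation never kept a
reference.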

>   	cio_gp_dma_free(cdev->private->dma_pool, cpu_addr, size);
> +	put_device(&cdev->dev);
>   }
>   EXPORT_SYMBOL(ccw_device_dma_free);
>   
> 
> base-commit: 64570fbc14f8d7cb3fe3995f20e26bc25ce4b2cc
> 

-- 
Pierre Morel
IBM Lab Boeblingen
