linux-kernel - Re: [PATCH 3/5] dmaengine: plx-dma: Introduce PLX DMA engine PCI driver skeleton

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20191112060919.GZ952516@vkoul-mobl>
Date:   Tue, 12 Nov 2019 11:39:19 +0530
From:   Vinod Koul <vkoul@...nel.org>
To:     Logan Gunthorpe <logang@...tatee.com>
Cc:     linux-kernel@...r.kernel.org, dmaengine@...r.kernel.org,
        Dan Williams <dan.j.williams@...el.com>
Subject: Re: [PATCH 3/5] dmaengine: plx-dma: Introduce PLX DMA engine PCI
 driver skeleton

On 11-11-19, 10:50, Logan Gunthorpe wrote:
> 
> 
> On 2019-11-09 10:35 a.m., Vinod Koul wrote:
> > On 22-10-19, 15:46, Logan Gunthorpe wrote:
> >> +static irqreturn_t plx_dma_isr(int irq, void *devid)
> >> +{
> >> +	return IRQ_HANDLED;
> > 
> > ??
> 
> Yes, sorry this is more of an artifact of how I chose to split the
> patches up. The ISR is filled-in in patch 4.

lets move this code in all including isr registration in patch 4 then :)

> >> +	 */
> >> +	schedule_work(&plxdev->release_work);
> >> +}
> >> +
> >> +static void plx_dma_put(struct plx_dma_dev *plxdev)
> >> +{
> >> +	kref_put(&plxdev->ref, plx_dma_release);
> >> +}
> >> +
> >> +static int plx_dma_alloc_chan_resources(struct dma_chan *chan)
> >> +{
> >> +	struct plx_dma_dev *plxdev = chan_to_plx_dma_dev(chan);
> >> +
> >> +	kref_get(&plxdev->ref);
> > 
> > why do you need to do this?
> 
> This has to do with being able to probably unbind while a channel is in
> use. If we don't hold a reference to the struct plx_dma_dev between
> alloc_chan_resources() and free_chan_resources() then it will panic if a
> call back is called after plx_dma_remove(). The way I've done it, once a

which callback?

> device is removed, subsequent calls to dma_prep_memcpy() will fail (see
> ring_active).
> 
> struct plx_dma_dev needs to be alive between plx_dma_probe() and
> plx_dma_remove(), and between calls to alloc_chan_resources() and
> free_chan_resources(). So we use a reference count to ensure this.

and that is why we hold module reference so we don't go away without
cleanup

> >> +static void plx_dma_release_work(struct work_struct *work)
> >> +{
> >> +	struct plx_dma_dev *plxdev = container_of(work, struct plx_dma_dev,
> >> +						  release_work);
> >> +
> >> +	dma_async_device_unregister(&plxdev->dma_dev);
> >> +	put_device(plxdev->dma_dev.dev);
> >> +	kfree(plxdev);
> >> +}
> >> +
> >> +static void plx_dma_release(struct kref *ref)
> >> +{
> >> +	struct plx_dma_dev *plxdev = container_of(ref, struct plx_dma_dev, ref);
> >> +
> >> +	/*
> >> +	 * The dmaengine reference counting and locking is a bit of a
> >> +	 * mess so we have to work around it a bit here. We might put
> >> +	 * the reference while the dmaengine holds the dma_list_mutex
> >> +	 * which means we can't call dma_async_device_unregister() directly
> >> +	 * here and it must be delayed.
> >
> > why is that, i have not heard any complaints about locking, can you
> > elaborate on why you need to do this?
> 
> Per the above explanation, we need to call plx_dma_put() in
> plx_dma_free_chan_resources(); and plx_dma_release() is when we can call
> dma_async_device_unregister() (seeing that's when we know there are no
> longer any active channels).
> 
> However, dma_chan_put() (which calls device_free_chan_resources()) holds
> the dma_list_mutex and dma_async_device_unregister() tries to take the
> dma_list_mutex so, if we call unregister inside free_chan_resources we
> would deadlock.

yes as we are not expecting someone to unregister in
device_free_chan_resources(), that is for freeing up resources.

You are expected to unregister in .remove!

Can you explain me why unregister cant be done in remove? I think I am
still missing some detail for this case.

-- 
~Vinod