Message-ID: <aAfi2SeDqD8IybOJ@f39>
Date: Tue, 22 Apr 2025 20:41:29 +0200
From: Eder Zulian <ezulian@...hat.com>
To: Nathan Lynch <nathan.lynch@....com>
Cc: Basavaraj.Natikar@....com, vkoul@...nel.org, dmaengine@...r.kernel.org,
linux-kernel@...r.kernel.org, jsnitsel@...hat.com,
ddutile@...hat.com
Subject: Re: [PATCH RFC 1/1] dmaengine: ptdma: use SLAB_TYPESAFE_BY_RCU for
the DMA descriptor slab

Hello Nathan,

On Thu, Apr 17, 2025 at 04:02:23PM -0500, Nathan Lynch wrote:
> Eder Zulian <ezulian@...hat.com> writes:
> > The SLAB_TYPESAFE_BY_RCU flag prevents a change of type for objects
> > allocated from the slab cache (although the memory may be reallocated to
> > a completely different object of the same type.) Moreover, when the
> > last reference to an object is dropped the finalization code must not
> > run until all __rcu pointers referencing the object have been updated,
> > and then a grace period has passed.
> >
> > Signed-off-by: Eder Zulian <ezulian@...hat.com>
> > ---
> > drivers/dma/amd/ptdma/ptdma-dmaengine.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/dma/amd/ptdma/ptdma-dmaengine.c b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
> > index 715ac3ae067b..b70dd1b0b9fb 100644
> > --- a/drivers/dma/amd/ptdma/ptdma-dmaengine.c
> > +++ b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
> > @@ -597,7 +597,8 @@ int pt_dmaengine_register(struct pt_device *pt)
> >
> >  	pt->dma_desc_cache = kmem_cache_create(desc_cache_name,
> >  					       sizeof(struct pt_dma_desc), 0,
> > -					       SLAB_HWCACHE_ALIGN, NULL);
> > +					       SLAB_HWCACHE_ALIGN |
> > +					       SLAB_TYPESAFE_BY_RCU, NULL);
>
> No, this code wasn't written to exploit SLAB_TYPESAFE_BY_RCU and this
> change can only obscure the problem. There's likely a data race in the
> driver.
>
Ack. Let's conclude my RFC and discard the proposed patch then.
Thank you very much for your feedback.

> I suspect pt_cmd_callback_work() has a bug:
>
>         spin_lock_irqsave(&chan->vc.lock, flags);
>         if (desc) {
>                 if (desc->status != DMA_COMPLETE) {
>                         if (desc->status != DMA_ERROR)
>                                 desc->status = DMA_COMPLETE;
>
>                         dma_cookie_complete(tx_desc);
>                         dma_descriptor_unmap(tx_desc);
>                 } else {
>                         tx_desc = NULL;
>                 }
>         }
>         spin_unlock_irqrestore(&chan->vc.lock, flags);
>
>         if (tx_desc) {
>                 dmaengine_desc_get_callback_invoke(tx_desc, NULL);
>                 dma_run_dependencies(tx_desc);
> >>>>            list_del(&desc->vd.node);   <<< must be done under vc.lock
>                 vchan_vdesc_fini(vd);
>         }
>
> But that's relatively new code that may not be in the kernel you're
> running.
>
True. pt_cmd_callback_work() wasn't in the kernel used for the tests, and
it seems to be used only if 'pt->ver == AE4_DMA_VERSION'. In that kernel,
pt_cmd_callback() would call pt_handle_active_desc(), which seemed to have
the same bug.
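
To make the ordering you describe concrete, here is a minimal userspace
sketch, not driver code. It assumes a pthread mutex standing in for vc.lock,
and every name in it (struct chan, struct desc, complete_desc(), count_cb())
is hypothetical; none of them exist in ptdma. The only point is that the
descriptor is unlinked while the lock is held and the callback runs after the
lock is dropped:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct node {
    struct node *prev, *next;
};

/* Hypothetical channel: the mutex plays the role of chan->vc.lock. */
struct chan {
    pthread_mutex_t lock;
    struct node head;            /* circular list of in-flight descriptors */
};

/* Hypothetical descriptor with a completion callback. */
struct desc {
    struct node node;
    int completed;
    void (*callback)(struct desc *);
};

/* Example callback that only counts invocations. */
static int cb_count;
static void count_cb(struct desc *d) { (void)d; cb_count++; }

static void chan_init(struct chan *c)
{
    pthread_mutex_init(&c->lock, NULL);
    c->head.prev = c->head.next = &c->head;
}

/* Append a descriptor to the tail of the list, under the lock. */
static void chan_add(struct chan *c, struct desc *d)
{
    pthread_mutex_lock(&c->lock);
    d->node.prev = c->head.prev;
    d->node.next = &c->head;
    c->head.prev->next = &d->node;
    c->head.prev = &d->node;
    pthread_mutex_unlock(&c->lock);
}

/* Count list entries, under the lock. */
static int chan_len(struct chan *c)
{
    int n = 0;

    pthread_mutex_lock(&c->lock);
    for (struct node *p = c->head.next; p != &c->head; p = p->next)
        n++;
    pthread_mutex_unlock(&c->lock);
    return n;
}

/*
 * Mark the descriptor complete and unlink it while holding the lock,
 * then invoke the callback only after the lock has been released.
 */
static void complete_desc(struct chan *c, struct desc *d)
{
    int run_cb = 0;

    assert(c != NULL && d != NULL);

    pthread_mutex_lock(&c->lock);
    if (!d->completed) {
        d->completed = 1;
        /* List manipulation happens under the lock. */
        d->node.prev->next = d->node.next;
        d->node.next->prev = d->node.prev;
        d->node.prev = d->node.next = &d->node;
        run_cb = 1;
    }
    pthread_mutex_unlock(&c->lock);

    if (run_cb && d->callback)
        d->callback(d);
}
```

In this sketch a second completion of the same descriptor is a no-op: the
completed flag is checked under the same lock that protects the list, so the
node is never unlinked twice.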
Eder