[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20200713092445.GH34333@vkoul-mobl>
Date: Mon, 13 Jul 2020 14:54:45 +0530
From: Vinod Koul <vkoul@...nel.org>
To: Dave Jiang <dave.jiang@...el.com>
Cc: Jiri Slaby <jirislaby@...nel.org>,
Swathi Kovvuri <swathi.kovvuri@...el.com>,
peter.ujfalusi@...com,
Linux kernel mailing list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] dmaengine: check device and channel list for empty
On 09-07-20, 08:23, Dave Jiang wrote:
>
>
> On 7/8/2020 10:35 PM, Jiri Slaby wrote:
> > On 07. 07. 20, 17:42, Dave Jiang wrote:
> > > On 7/6/2020 11:05 PM, Jiri Slaby wrote:
> > > > On 26. 06. 20, 20:09, Dave Jiang wrote:
> > > > > Check dma device list and channel list for empty before iterate as the
> > > > > iteration function assume the list to be not empty. With devices and
> > > > > channels now being hot pluggable this is a condition that needs to be
> > > > > checked. Otherwise it can cause the iterator to spin forever.
> > > >
> > > > Could you be a little bit more specific how this can spin forever? I.e.
> > > > can you attach a stacktrace of such a behaviour?
> > >
> > > I can't seem to find the original splat that lead me to the conclusion
> > > of it's spinning forever. As I recall, the issue seems to produce
> > > different splats and not always consistent in being reproduced. Here's a
> > > partial splat that was tracked by the internal bug database. Since with
> > > the dma device and channel list being are hot added and removed, the
> > > device and channel lists can be empty. The list_entry() and friends
> > > expect the list to not be empty (according to header comment), I added
> > > the check to ensure that isn't the case before using them in dmaengine.
> >
> > Yes, the comment states that as it is true: you receive a
> > wild/non-checkable pointer if you do list_entry on an empty list. BUT
> > have you actually read what I wrote:
> >
> > > > As in the empty case, "&pos->member" is "head" (look into
> > > > list_for_each_entry) and the for loop should loop exactly zero times.
> >
> > HERE ^^^^
> >
> > > With the fix, we can no longer produce any of the splats. So maybe the
> > > above was a bad description of the issue.
> >
> > No, not only the description, worse, the patch proper looks wrong.
> >
> > > [ 4216.048375] ? dma_channel_rebalance+0x7b/0x250
> > > [ 4216.056360] dma_async_device_register+0x349/0x3a0
> > > [ 4216.064604] idxd_register_dma_device+0x90/0xc0 [idxd]
> > > [ 4216.073175] idxd_config_bus_probe.cold+0x7d/0x1fc [idxd]
> >
> > So, the good part in the patch is the fixed locking in
> > dma_async_device_register. Otherwise it adds nonsense checks. So you
> > fixed the issue only by a chance, by a side effect as Peter pointed out.
> > Leaving aside that you broke dma_request_chan -- that could happen to
> > anybody.
> >
> > Vinod, please drop/revert this patch. Then start over only with
> > dma_async_device_register fixed locking.
>
> I'll start on the proper fix.
Dropped
--
~Vinod
Powered by blists - more mailing lists