Message-ID: <6671d95514e39e59dfda04a7a7ed1b83df001477.camel@gmail.com>
Date: Mon, 28 Oct 2024 12:08:49 +0100
From: Nuno Sá <noname.nuno@...il.com>
To: Jonathan Cameron <jic23@...nel.org>
Cc: David Lechner <dlechner@...libre.com>, Mark Brown <broonie@...nel.org>,
Rob Herring <robh@...nel.org>, Krzysztof Kozlowski <krzk+dt@...nel.org>,
Conor Dooley <conor+dt@...nel.org>, Nuno Sá
<nuno.sa@...log.com>, Uwe Kleine-König
<ukleinek@...nel.org>, Michael Hennerich <Michael.Hennerich@...log.com>,
Lars-Peter Clausen <lars@...afoo.de>, David Jander <david@...tonic.nl>,
Martin Sperl <kernel@...tin.sperl.org>, linux-spi@...r.kernel.org,
devicetree@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-iio@...r.kernel.org, linux-pwm@...r.kernel.org
Subject: Re: [PATCH RFC v4 11/15] iio: buffer-dmaengine: add
devm_iio_dmaengine_buffer_setup_ext2()
On Sat, 2024-10-26 at 16:48 +0100, Jonathan Cameron wrote:
> On Fri, 25 Oct 2024 20:40:42 +0200 (GMT+02:00)
> Nuno Sá <noname.nuno@...il.com> wrote:
>
> > Oct 25, 2024 18:42:02 David Lechner <dlechner@...libre.com>:
> >
> > > On 10/25/24 8:24 AM, Nuno Sá wrote:
> > > > I still need to look at this more closely, but I do have one thought
> > > > already :)
> > > >
> > > > On Wed, 2024-10-23 at 15:59 -0500, David Lechner wrote:
> > > > > Add a new devm_iio_dmaengine_buffer_setup_ext2() function to handle
> > > > > cases where the DMA channel is managed by the caller rather than
> > > > > being
> > > > > requested and released by the iio_dmaengine module.
> > > > >
> > > > > Signed-off-by: David Lechner <dlechner@...libre.com>
> > > > > ---
> > > > >
> > > > > v4 changes:
> > > > > * This replaces "iio: buffer-dmaengine: generalize requesting DMA
> > > > > channel"
> > > > > ---
> > >
> > > ...
> > >
> > > > > @@ -282,12 +281,38 @@ void iio_dmaengine_buffer_free(struct iio_buffer *buffer)
> > > > > iio_buffer_to_dmaengine_buffer(buffer);
> > > > >
> > > > > iio_dma_buffer_exit(&dmaengine_buffer->queue);
> > > > > - dma_release_channel(dmaengine_buffer->chan);
> > > > > -
> > > > > iio_buffer_put(buffer);
> > > > > +
> > > > > + if (dmaengine_buffer->owns_chan)
> > > > > + dma_release_channel(dmaengine_buffer->chan);
> > > >
> > > > Not sure I agree much with this owns_chan flag. The way I see it, we
> > > > should always hand over the lifetime of the DMA channel to the IIO DMA
> > > > framework. Note that the device you pass in for requesting the channel
> > > > of the spi_offload and for setting up the DMA buffer is the same (and
> > > > I suspect it always will be), so I would not go to the trouble. And
> > > > with this assumption we could simplify the spi implementation a bit
> > > > more.
> > >
> > > I tried something like this in v3 but Jonathan didn't seem to like it.
> > >
> > > https://lore.kernel.org/all/20240727144303.4a8604cb@jic23-huawei/
> > >
> > > >
> > > > Not really related, but I suspect the current implementation could be
> > > > problematic. Basically, I suspect that the lifetime of the DMA channel
> > > > should be tied to the lifetime of the iio_buffer. IOW, we should only
> > > > release the channel in iio_dmaengine_buffer_release() - in which case
> > > > the current implementation with the spi_offload would also be buggy.
> > >
> > > The buffer can outlive the iio device driver that created the buffer?
> >
> > Yes, it can, just like the IIO device itself. If a userspace app has an
> > open FD for the buffer chardev, we hold a reference that is only released
> > when the FD is closed (which can outlive the device's binding to its
> > driver). That is why we nullify indio_dev->info and check for it in the
> > read() and write() fops.
> >
> > FWIW, I raised concerns about this in the past (as we don't have any
> > locking in those paths) but Jonathan rightfully wanted to see a real
> > race. I was too lazy to try to reproduce one, but I'm still fairly sure
> > we have (at least theoretical) races in those paths. One of them could be
> > (I think) concurrently hitting a DMA submit block while the device is
> > being unbound. In that case the DMA chan would already be released and we
> > could still try to initiate a transfer. I did not check whether that
> > would actually crash, but it should not happen either way.
> >
> There are a few places where I've been meaning to have another look
> at our protections during unregister. May well be problems hiding here
> and in general the thinking on how to do this in the kernel has slowly
> been changing so we might be able to clean things up in general.
>
Yeah, I'm fairly sure things like [1] are not enough to prevent potential nasty
races (though they should be hard to trigger). OTOH, in [2], we do have proper
locking.

A simple solution would be to take the info lock in the buffer read() and write()
paths. I do realize that's a fastpath, but I don't think it would be a very
contended lock. We can surely do better, though, and RCU could be a good candidate
(we could do something similar to what gpiolib does), and I wouldn't expect it to
be that complicated to implement. The biggest issue with making info an __rcu
pointer would be changing all IIO drivers to set the pointer with
rcu_assign_pointer(). Though during probe there's no potential race, so what we
have today should be fine (I'm just not sure whether tools like sparse would
complain about the "raw" assignment).
[1]: https://elixir.bootlin.com/linux/v6.12-rc4/source/drivers/iio/industrialio-buffer.c#L176
[2]: https://elixir.bootlin.com/linux/v6.12-rc4/source/drivers/iio/industrialio-core.c#L1825
- Nuno Sá