Message-ID: <x7qd7h7jmadf563u2wbfsxwp7s4isvkyqqpmw7t5e6qhccrwkc@yq24sfgl75pe>
Date:   Fri, 6 Oct 2023 15:56:25 +0530
From:   Jai Luthra <j-luthra@...com>
To:     Laurent Pinchart <laurent.pinchart@...asonboard.com>,
        Vinod Koul <vkoul@...nel.org>,
        Vignesh Raghavendra <vigneshr@...com>
CC:     Tomi Valkeinen <tomi.valkeinen@...asonboard.com>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        Rob Herring <robh+dt@...nel.org>,
        Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
        Conor Dooley <conor+dt@...nel.org>,
        Sakari Ailus <sakari.ailus@...ux.intel.com>,
        <linux-media@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <devicetree@...r.kernel.org>,
        <linux-arm-kernel@...ts.infradead.org>,
        Mauro Carvalho Chehab <mchehab+samsung@...nel.org>,
        Maxime Ripard <mripard@...nel.org>,
        <niklas.soderlund+renesas@...natech.se>,
        Benoit Parrot <bparrot@...com>,
        Vaishnav Achath <vaishnav.a@...com>, <nm@...com>,
        <devarsht@...com>, <a-bhatia1@...com>,
        Martyn Welch <martyn.welch@...labora.com>,
        Julien Massot <julien.massot@...labora.com>
Subject: Re: [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E

Hi Laurent, Vignesh, Vinod,

I have some good news: there is an upper bound on the amount of data 
stored in the FIFOs (~32 KiB), so we don't need to allocate a buffer of 
the full frame size.

On Oct 04, 2023 at 23:03:12 +0300, Laurent Pinchart wrote:
> On Wed, Oct 04, 2023 at 07:21:00PM +0530, Vinod Koul wrote:
> > On 29-08-23, 18:55, Laurent Pinchart wrote:
> > > Hi Jai,
> > > 
> > > (CC'ing Vinod, the maintainer of the DMA engine subsystem, for a
> > > question below)
> > 
> > Sorry, this got lost.
> 
> No worries.
> 
> > > On Fri, Aug 18, 2023 at 03:55:06PM +0530, Jai Luthra wrote:
> > > > On Aug 15, 2023 at 16:00:51 +0300, Tomi Valkeinen wrote:
> > > > > On 11/08/2023 13:47, Jai Luthra wrote:
> > > > > > From: Pratyush Yadav <p.yadav@...com>
> > > 
> > > [snip]
> > > 
> > > > > > +static int ti_csi2rx_start_streaming(struct vb2_queue *vq, unsigned int count)
> > > > > > +{
> > > > > > +	struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vq);
> > > > > > +	struct ti_csi2rx_dma *dma = &csi->dma;
> > > > > > +	struct ti_csi2rx_buffer *buf;
> > > > > > +	unsigned long flags;
> > > > > > +	int ret = 0;
> > > > > > +
> > > > > > +	spin_lock_irqsave(&dma->lock, flags);
> > > > > > +	if (list_empty(&dma->queue))
> > > > > > +		ret = -EIO;
> > > > > > +	spin_unlock_irqrestore(&dma->lock, flags);
> > > > > > +	if (ret)
> > > > > > +		return ret;
> > > > > > +
> > > > > > +	dma->drain.len = csi->v_fmt.fmt.pix.sizeimage;
> > > > > > +	dma->drain.vaddr = dma_alloc_coherent(csi->dev, dma->drain.len,
> > > > > > +					      &dma->drain.paddr, GFP_KERNEL);
> > > > > > +	if (!dma->drain.vaddr)
> > > > > > +		return -ENOMEM;
> > > > > 
> > > > > This is still allocating a large buffer every time streaming is started (and
> > > > > with streams support, a separate buffer for each stream?).
> > > > > 
> > > > > Did you check if the TI DMA can do writes to a constant address? That would
> > > > > be the best option, as then the whole buffer allocation problem goes away.
> > > > 
> > > > I checked with Vignesh: the hardware can support a scenario where we 
> > > > flush out all the data without allocating a buffer, but I couldn't find 
> > > > a way to signal that via the current dmaengine framework APIs. I will 
> > > > look into it further, as it will be important for multi-stream support.
> > > 
> > > That would be the best option. It's not immediately apparent to me if
> > > the DMA engine API supports such a use case.
> > > dmaengine_prep_interleaved_dma() gives you finer-grained control over the
> > > source and destination increments, but I haven't seen a way to instruct
> > > the DMA engine to direct writes to /dev/null (so to speak). Vinod, is
> > > this something that is supported, or could be supported?
> > 
> > Writing to a dummy buffer could have the same behaviour, no?
> 
> Yes, but if the DMA engine can write to /dev/null, that avoids
> allocating a dummy buffer, which is nicer. For video use cases, dummy
> buffers are often large.
> 
> > > > > Alternatively, can you flush the buffers with multiple one-line transfers?
> > > > > The flushing shouldn't be performance critical, so even if that's slower
> > > > > than a normal full-frame DMA, it shouldn't matter much. And if that can be
> > > > > done, a single probe-time line-buffer allocation should do the trick.
> > > > 
> > > > There will be considerable overhead if we queue many DMA transactions 
> > > > (on the order of 100s or even 1000s), which might not be okay for the 
> > > > scenarios where we have to drain mid-stream. I will have to run some 
> > > > experiments to see if that is worth it.
> > > > 
> > > > But one optimization we can certainly make is to re-use a single drain 
> > > > buffer for all the streams. We will need to make sure to re-allocate the 
> > > > buffer at stream-on time for the largest frame size supported across the 
> > > > different streams.
> > > 
> > > If you implement .device_prep_interleaved_dma() in the DMA engine driver
> > > you could write to a single line buffer, assuming that the hardware
> > > supports this in a generic way.
> > > 
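For my own understanding, here is a rough sketch of how such a drain 
could look through the interleaved API, assuming the DMA engine driver 
implemented .device_prep_interleaved_dma() and honoured dst_inc = false 
(neither is verified for our hardware yet, and drain_via_interleaved() 
and its parameters are made-up names for illustration):

#include <linux/dmaengine.h>
#include <linux/overflow.h>
#include <linux/slab.h>

/*
 * Sketch: drain the FIFO by pointing every chunk of an interleaved
 * transfer at the same small line buffer. dst_inc = false asks the
 * engine not to advance the destination between chunks, so a single
 * line-sized allocation absorbs the whole drain. Whether a given
 * engine honours this is driver-specific.
 */
static int drain_via_interleaved(struct dma_chan *chan, dma_addr_t line_paddr,
				 size_t line_len, unsigned int lines)
{
	struct dma_interleaved_template *xt;
	struct dma_async_tx_descriptor *desc;
	dma_cookie_t cookie;

	xt = kzalloc(struct_size(xt, sgl, 1), GFP_KERNEL);
	if (!xt)
		return -ENOMEM;

	xt->dst_start = line_paddr;	/* same line buffer every time */
	xt->dir = DMA_DEV_TO_MEM;
	xt->src_inc = false;		/* source is the device FIFO */
	xt->dst_inc = false;		/* never advance the destination */
	xt->numf = lines;		/* repeat the chunk 'lines' times */
	xt->frame_size = 1;
	xt->sgl[0].size = line_len;
	xt->sgl[0].icg = 0;

	desc = dmaengine_prep_interleaved_dma(chan, xt, DMA_PREP_INTERRUPT);
	kfree(xt);
	if (!desc)
		return -EIO;

	cookie = dmaengine_submit(desc);
	if (dma_submit_error(cookie))
		return -EIO;

	dma_async_issue_pending(chan);
	return 0;
}

If the engine can really hold the destination constant like this, the 
"buffer" shrinks to a single line, which gets close to the /dev/null 
behaviour discussed above.
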
> > > > My guess is that the endpoint is not buffering a full frame's worth of 
> > > > data. I will also check whether we can upper-bound that size to 
> > > > something feasible.

According to the spec, the endpoint buffers a maximum of 2048 x (128-bit) 
samples, i.e. 2048 x 16 bytes = 32 KiB.

I ran some experiments with the drain disabled, inspecting the subsequent 
corrupt frames for stale data, and the stale data was always a small 
multiple (fewer than 20) of 128-bit samples.

Given we have an upper bound, I think a practical solution for now is to 
allocate a single re-usable 32 KiB buffer at probe time (I will send v10 
with this fix).
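For reference, the probe-time allocation I have in mind looks roughly 
like this (a sketch only: ti_csi2rx_alloc_drain_buffer() is a made-up 
helper name, and the drain fields are the ones used in the patch above):

#include <linux/dma-mapping.h>
#include <linux/sizes.h>

/* Upper bound from the spec: 2048 x 128-bit FIFO samples = 32 KiB. */
#define TI_CSI2RX_DRAIN_BUF_SIZE	SZ_32K

/*
 * Sketch: allocate one reusable drain buffer for the lifetime of the
 * device, sized by the FIFO bound instead of the frame size, so
 * stream-on no longer allocates a full-frame buffer per stream.
 */
static int ti_csi2rx_alloc_drain_buffer(struct ti_csi2rx_dev *csi)
{
	struct ti_csi2rx_dma *dma = &csi->dma;

	dma->drain.len = TI_CSI2RX_DRAIN_BUF_SIZE;
	dma->drain.vaddr = dma_alloc_coherent(csi->dev, dma->drain.len,
					      &dma->drain.paddr, GFP_KERNEL);
	return dma->drain.vaddr ? 0 : -ENOMEM;
}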

Although it would be ideal if we could do this without *any* buffers at 
all.

> > > > 
> > > > > Other than this drain buffer topic, I think this looks fine. So, I'm going
> > > > > to give Rb, but I do encourage you to look more into optimizing this drain
> > > > > buffer.
> > > > 
> > > > Thank you!
> > > > 
> > > > > Reviewed-by: Tomi Valkeinen <tomi.valkeinen@...asonboard.com>
> 
> -- 
> Regards,
> 
> Laurent Pinchart

-- 
Thanks,
Jai

GPG Fingerprint: 4DE0 D818 E5D5 75E8 D45A AFC5 43DE 91F9 249A 7145
