Message-ID: <aG94uNDrL1MdHJPM@duo.ucw.cz>
Date: Thu, 10 Jul 2025 10:24:24 +0200
From: Pavel Machek <pavel@....cz>
To: kraxel@...hat.com, vivek.kasireddy@...el.com,
	dri-devel@...ts.freedesktop.org, sumit.semwal@...aro.org,
	benjamin.gaignard@...labora.com, Brian.Starkey@....com,
	jstultz@...gle.com, tjmercier@...gle.com,
	linux-media@...r.kernel.org, linaro-mm-sig@...ts.linaro.org,
	kernel list <linux-kernel@...r.kernel.org>,
	laurent.pinchart@...asonboard.com, l.stach@...gutronix.de,
	linux+etnaviv@...linux.org.uk, christian.gmeiner@...il.com,
	etnaviv@...ts.freedesktop.org, phone-devel@...r.kernel.org
Subject: DMA-BUFs always uncached on arm64, causing poor camera performance
 on Librem 5

Hi!

It seems that DMA-BUFs are always uncached on arm64... which is a
problem.

I'm trying to get useful camera support on Librem 5, and that includes
recording videos (and taking photos).

memcpy() from normal memory takes about 2msec/1MB. Unfortunately, for
DMA-BUFs it is 20msec/1MB, and that basically means I can't easily do
760p video recording. Plus, copying a full-resolution photo buffer takes
more than 200msec!

There's a possibility to do some processing on the GPU, and it's implemented here:

https://gitlab.com/tui/tui/-/tree/master/icam?ref_type=heads

but that hits the same problem in the end -- data is in DMA-BUF,
uncached, and takes way too long to copy out.

And that's ... wrong. DMA ended seconds ago, complete cache flush
would be way cheaper than copying single frame out, and I still have
to deal with uncached frames.

So I have two questions:

1) Is my analysis correct that, no matter how I get a frame from v4l and
process it on the GPU, I'll have to copy it from uncached memory in the
end?

2) Does anyone have patches / ideas / a roadmap for how to solve that? It
makes the GPU unusable for computing, and the camera basically unusable for
video.

Best regards,
								Pavel
-- 
I don't work for Nazis and criminals, and neither should you.
Boycott Putin, Trump, and Musk!
