Message-ID: <3ff77f5c-b13b-49f5-98b0-a799453768d0@baylibre.com>
Date: Wed, 4 Jun 2025 08:21:51 -0500
From: David Lechner <dlechner@...libre.com>
To: Dharma.B@...rochip.com, kamel.bouhara@...tlin.com, wbg@...nel.org,
 Nicolas.Ferre@...rochip.com, alexandre.belloni@...tlin.com,
 claudiu.beznea@...on.dev
Cc: linux-arm-kernel@...ts.infradead.org, linux-iio@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] counter: microchip-tcb-capture: Add DMA support for
 TC_RAB register reads

On 6/4/25 1:15 AM, Dharma.B@...rochip.com wrote:
> On 29/05/25 9:03 pm, David Lechner wrote:
>> On 5/28/25 1:13 AM, Dharma Balasubiramani wrote:
>>> Add optional DMA-based data transfer support to read the TC_RAB register,
>>> which provides the next unread captured value from either RA or RB. This
>>> improves performance and offloads CPU when mchp,use-dma-cap is enabled in
>>> the device tree.
>>
>> It looks like this is using DMA to read a single register in the implementation
>> of a sysfs read. Do you have measurements to show the performance difference?
>> I find it hard to believe that this would actually make a significant difference
>> compared to the overhead of the read syscall to read the sysfs attribute.
>>
> Hi David,
> 
> Thanks for the feedback.
> 
> You're right — in our current testing setup, I didn't observe any 
> significant performance benefit from using DMA to read the TC_RAB 
> register via sysfs. I benchmarked both DMA-based and direct MMIO 
> register access using a userspace program generating high-frequency 
> capture events, and the overhead of the sysfs read path seems to 
> dominate in both cases.
> 
> Our initial motivation for using DMA was that the TCB IP in Microchip 
> SoCs includes optional DMA support specifically for capture value 
> transfers. I wanted to evaluate the potential benefit of offloading CPU 
> load when frequent capture events are occurring. However, in practice, 
> the complexity added (especially due to blocking behavior in atomic 
> contexts like watch) does not appear to be justified, at least via sysfs 
> or simple polling.
> 
> I also tried routing the DMA-based read through the 
> COUNTER_COMPONENT_EXTENSION watch path, but as you may expect, that 
> ended up hanging due to blocking behavior in non-sleepable contexts. So 
> that route seems unsuitable without a more complex asynchronous 
> buffering model.
> 
> Would you suggest exploring a different approach or a more appropriate 
> interface for DMA-based capture (e.g., via a dedicated ioctl or char 
> device with async support)? I’m happy to rework it if there's a suitable 
> context where DMA adds measurable value.
> 
> Thanks again for your review and time.
> 

Adding a feature just to make use of something a chip can do doesn't
seem like the wisest approach. Without knowing how people will actually
want to use it, we would only be guessing during the design of the
userspace interface. It would be better to wait until there is an
actual real-world use case and design something around that need.

