lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <9cea1e05-e24a-47c0-b974-d9e58e2e3cf8@amlogic.com>
Date: Tue, 22 Jul 2025 10:32:53 +0800
From: Chuan Liu <chuan.liu@...ogic.com>
To: Martin Blumenstingl <martin.blumenstingl@...glemail.com>
Cc: Neil Armstrong <neil.armstrong@...aro.org>,
 Kevin Hilman <khilman@...libre.com>, Jerome Brunet <jbrunet@...libre.com>,
 linux-arm-kernel@...ts.infradead.org, linux-amlogic@...ts.infradead.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH] soc: amlogic: clk-measure: Optimize measurement accuracy


On 7/22/2025 4:16 AM, Martin Blumenstingl wrote:
> [ EXTERNAL EMAIL ]
>
> On Fri, Jul 18, 2025 at 8:14 AM Chuan Liu <chuan.liu@...ogic.com> wrote:
>> hi Marti:
>>
>>
>> On 7/17/2025 11:43 PM, Martin Blumenstingl wrote:
>>> [ EXTERNAL EMAIL ]
>>>
>>> Hello,
>>>
>>> thank you for this patch!
>>>
>>> On Thu, Jul 17, 2025 at 5:08 AM Chuan Liu via B4 Relay
>>> <devnull+chuan.liu.amlogic.com@...nel.org> wrote:
>>>> From: Chuan Liu <chuan.liu@...ogic.com>
>>>>
>>>> The cycle count register has a 20-bit effective width, but the driver
>>>> only utilizes 16 bits. This reduces the sampling window when measuring
>>>> high-frequency clocks, resulting in (slightly) degraded measurement
>>>> accuracy.
>>> I checked the Meson8 downstream code [0] and it uses 0x000FFFFF to
>>> mask the register value -> this means that old SoCs also have a 20-bit
>>> wide width.
>>>
>>> [...]
>>>> Here are the test comparisons based on C3:
>>> [...]
>>>> Here are the test comparisons based on C3:
>>> I have tested this patch with Meson8b based Odroid-C1:
>>> pre-optimization:
>>> # time cat /sys/kernel/debug/meson-clk-msr/measure_summary | grep -v "  0  "
>>>     clock                     rate    precision
>>> ---------------------------------------------
>>>    clk81                 159372396    +/-5208Hz
>>>    a9_clk_div16           24000000    +/-3125Hz
>>>    rtc_osc_clk_out           31250    +/-3125Hz
>>>    hdmi_ch0_tmds         146399038    +/-4807Hz
>>>    sar_adc                 1140625    +/-3125Hz
>>>    sdhc_rx                94443750    +/-3125Hz
>>>    sdhc_sd                94443750    +/-3125Hz
>>>    pwm_d                 849921875    +/-31250Hz
>>>    pwm_c                 849921875    +/-31250Hz
>>>
>>> real    0m0.102s
>>> user    0m0.005s
>>> sys     0m0.069s
>>>
>>>
>>> post-optimization:
>>> # time cat /sys/kernel/debug/meson-clk-msr/measure_summary | grep -v "  0  "
>>>     clock                     rate    precision
>>> ---------------------------------------------
>>>    clk81                 159373438    +/-1562Hz
>>>    a9_clk_div16           12000000    +/-1562Hz
>>>    rtc_osc_clk_out           32813    +/-1562Hz
>>>    hdmi_ch0_tmds         146398438    +/-1562Hz
>>>    sar_adc                 1143750    +/-1562Hz
>>>    sdhc_rx                94443750    +/-1562Hz
>>>    sdhc_sd                94443750    +/-1562Hz
>>>    pwm_d                 849992188    +/-1562Hz
>>>    pwm_c                 849992188    +/-1562Hz
>>>
>>> real    0m0.173s
>>> user    0m0.008s
>>> sys     0m0.109s
>>>
>>> So there's also an improvement in accuracy. The only downside I'm
>>> seeing is that it takes 75% extra time for the measurement. For me
>>> this is irrelevant since we use this for debugging.
>>>
>>> [...]
>>>> +       /*
>>>> +        * HACK: The input clock signal path from gate (Controlled by MSR_RUN)
>>>> +        * to internal sampling circuit in clk-measure has a propagation delay
>>>> +        * requirement: 24 clock cycles must elapse after mux selection before
>>>> +        * sampling.
>>>> +        *
>>>> +        * For a 30kHz measurement clock, this translates to an 800μs delay:
>>>> +        * 800us = 24 / 30000Hz.
>>>> +        */
>>>> +       fsleep(800);
>>> What is needed to make this not a HACK anymore? Is there a register
>>> that we can poll for the number of clock cycles that have passed?
>>
>> The required delay duration is frequency-dependent on the measurement
>> clock source. The current 800μs delay is calculated based on a
>> minimum input clock frequency of 30kHz. At higher input frequencies,
>> this delay could be proportionally reduced. Applying a uniform 800μs
>> delay therefore appears overly conservative for general use cases.
>>
>>
>> The IP currently lacks a status register to detect whether the input
>> clock signal has successfully propagated to the internal measurement
>> circuitry.
>>
>>
>> The design of this IP has been maintained for many years. From a
>> hardware design perspective, there is room for optimization in this
>> signal propagation delay. Future IP updates may not require such a
>> long delay.
> Thanks for the detailed description. To me this doesn't seem like a
> "hack" then, it's just something that's needed to interface with the
> hardware correctly.
> My suggestion is to replace the word "HACK" with "NOTE".
>
> What do you think?


OK, thanks for your suggestion. I'll replace it with “NOTE” in the next 
version.


>
> Best regards,
> Martin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ