Date:   Sat, 25 Jun 2022 12:43:54 +0200
From:   Jernej Škrabec <jernej.skrabec@...il.com>
To:     peron.clem@...il.com, Roman Stratiienko <r.stratiienko@...il.com>
Cc:     mturquette@...libre.com, sboyd@...nel.org, mripard@...nel.org,
        wens@...e.org, linux-clk@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org, linux-sunxi@...ts.linux.dev,
        linux-kernel@...r.kernel.org,
        Roman Stratiienko <r.stratiienko@...il.com>
Subject: Re: [PATCH] clk: sunxi-ng: sun50i: h6: Modify GPU clock configuration to support DFS

Hi Roman,

On Friday, 24 June 2022 at 18:52:11 CEST, Roman Stratiienko wrote:
> Using simple bash script it was discovered that not all CCU registers
> can be safely used for DFS, e.g.:
> 
>     while true
>     do
>         devmem 0x3001030 4 0xb0003e02
>         devmem 0x3001030 4 0xb0001e02
>     done
> 
> Script above changes the GPU_PLL multiplier register value. While the
> script is running, the user should interact with the user interface.
> 
> Using this method the following results were obtained:
> | Register  | Name           | Bits  | Values | Result |
> | --        | --             | --    | --     | --     |
> | 0x3001030 | GPU_PLL.MULT   | 15..8 | 20-62  | OK     |
> | 0x3001030 | GPU_PLL.INDIV  |     1 | 0-1    | OK     |
> | 0x3001030 | GPU_PLL.OUTDIV |     0 | 0-1    | FAIL   |
> | 0x3001670 | GPU_CLK.DIV    |  3..0 | ANY    | FAIL   |
> 
> Once bits that caused system failure disabled (kept default 0),
> it was discovered that GPU_CLK.MUX was used during DFS for some
> reason and was causing the failure too.
> 
> After disabling GPU_PLL.OUTDIV the system started to fail during
> booting for some reason until the maximum frequency of GPU_PLL
> clock was limited to 756MHz.
> 
> After all the changes made DVFS started to work seamlessly.

I appreciate the testing effort, but I don't think a userspace approach is a
good way to test DVFS. I see two issues:
- As the name already suggests, voltage also plays a crucial role in
stability. You didn't say which board you tested this on, but I assume it has
a PMIC. Did you make sure the GPU voltage regulator is always at 1.04 V, which
is needed for 756 MHz?
- The kernel clock driver always goes through the proper procedure for a clock
rate change, which involves several steps. Bypassing them might also cause
stability problems.
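
To make clear what the test loop actually toggles, here is a minimal sketch
that decodes the two written values using only the bit layout from your table.
The struct and function names are hypothetical, and the raw field values are
deliberately not translated into frequencies.

```c
#include <stdint.h>

/* Field layout taken from the table in the quoted message. */
struct gpu_pll_fields {
	uint32_t mult;		/* GPU_PLL.MULT, bits 15..8 */
	uint32_t indiv;		/* GPU_PLL.INDIV, bit 1 */
	uint32_t outdiv;	/* GPU_PLL.OUTDIV, bit 0 */
};

/* Hypothetical helper: split a raw register value into its fields. */
struct gpu_pll_fields gpu_pll_decode(uint32_t reg)
{
	struct gpu_pll_fields f;

	f.mult = (reg >> 8) & 0xff;
	f.indiv = (reg >> 1) & 0x1;
	f.outdiv = reg & 0x1;
	return f;
}
```

With this layout, 0xb0003e02 decodes to a multiplier field of 0x3e (62) and
0xb0001e02 to 0x1e (30). The INDIV bit is 1 and the OUTDIV bit is 0 in both,
so only the multiplier changes between the two writes.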

I agree that the GPU PLL should be limited to 756 MHz max. This seems to be
the maximum operating point specified in the vendor DT. But I managed to
extract some more information from the vendor GPU driver, more specifically
from this snippet, located in
modules/gpu/mali-midgard/kernel_mode/driver/drivers/gpu/arm/
midgard/platform/sunxi/mali_kbase_config_sunxi.c:

pll_freq = target->freq;
while (pll_freq < 288000000)
	pll_freq *= 2;

err = clk_set_rate(sunxi_mali->gpu_pll_clk, pll_freq);
<...>
err = clk_set_rate(kbdev->clock, target->freq);
<...>

Apparently, the minimum stable PLL frequency is 288 MHz (this should be added),
and the divider in the peripheral clock can really be used, although
preferably not. The vendor GPU operating points specify only two points lower
than 288 MHz, at 264 MHz and 216 MHz. I'm fully aware that they may not be
really stable, and given that these two and the next two all share the minimum
voltage of 810 mV, power and thermal savings are probably not that great, so
we can skip them and pin the peripheral divider to 1, as you already did.
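
To illustrate what the vendor loop implies for the two lowest points, here is
a minimal standalone sketch. It mirrors the doubling loop from the snippet
above and additionally tracks the divider the peripheral clock would need to
get back to the target rate. The struct, the function name and the explicit
divider bookkeeping are my assumptions; in the vendor driver the divider is
chosen implicitly by the second clk_set_rate() call.

```c
/* Minimum PLL rate used by the vendor loop quoted above. */
#define GPU_PLL_MIN_HZ	288000000UL

/* Hypothetical result type: PLL rate plus the divider needed after it. */
struct gpu_rate_plan {
	unsigned long pll_hz;
	unsigned int div;
};

/*
 * Double the PLL rate until it reaches the minimum, as the vendor code
 * does, and record how much the peripheral clock must divide it down.
 * A target of 0 is rejected (div == 0) to avoid looping forever.
 */
struct gpu_rate_plan gpu_plan_rate(unsigned long target_hz)
{
	struct gpu_rate_plan plan = { target_hz, 1 };

	if (target_hz == 0) {
		plan.div = 0;
		return plan;
	}

	while (plan.pll_hz < GPU_PLL_MIN_HZ) {
		plan.pll_hz *= 2;
		plan.div *= 2;
	}

	return plan;
}
```

For 264 MHz this gives a 528 MHz PLL with a divider of 2, and for 216 MHz a
432 MHz PLL with a divider of 2. Every point at or above 288 MHz keeps a
divider of 1, which is the case the patch pins.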

Another discrepancy I see is that the vendor DT has two operating points, at
336 MHz and 384 MHz, which also use factor P (also known as d2 in the vendor
clock source). This can again be an oversight or, alternatively, it can be
that the P factor can actually be used, but only at lower frequencies.

Can you please run another test with the GPU operating points specified in the
DT and check whether it works with the P factor left in?

For reference, the vendor DT has the following operating points (kHz, uV):
756000 1040000
624000 950000
576000 930000
540000 910000
504000 890000
456000 870000
432000 860000
420000 850000
408000 840000
384000 830000
360000 820000
336000 810000
312000 810000
264000 810000
216000 810000
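
As a rough sketch of how the voltage question from my first point maps onto
this table, the lookup below returns the voltage of the lowest operating point
that is fast enough for a requested rate. The values are copied from the list
above; the type and function names are hypothetical, and this is not how the
kernel OPP framework is actually wired up.

```c
#include <stddef.h>

/* One vendor operating point: frequency in kHz, voltage in uV. */
struct gpu_opp {
	unsigned long khz;
	unsigned long uv;
};

/* Vendor DT operating points, highest frequency first. */
static const struct gpu_opp vendor_opps[] = {
	{ 756000, 1040000 }, { 624000, 950000 }, { 576000, 930000 },
	{ 540000, 910000 }, { 504000, 890000 }, { 456000, 870000 },
	{ 432000, 860000 }, { 420000, 850000 }, { 408000, 840000 },
	{ 384000, 830000 }, { 360000, 820000 }, { 336000, 810000 },
	{ 312000, 810000 }, { 264000, 810000 }, { 216000, 810000 },
};

/*
 * Walk the table from the slowest point upwards and return the voltage
 * of the first point whose frequency is at least khz. Returns 0 if the
 * requested rate is above the fastest operating point.
 */
unsigned long gpu_opp_min_uv(unsigned long khz)
{
	size_t n = sizeof(vendor_opps) / sizeof(vendor_opps[0]);
	size_t i;

	for (i = n; i > 0; i--) {
		if (vendor_opps[i - 1].khz >= khz)
			return vendor_opps[i - 1].uv;
	}

	return 0;
}
```

For example, a request for 756000 kHz yields 1040000 uV, and anything up to
336000 kHz stays at the 810000 uV floor.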

Best regards,
Jernej

> 
> Signed-off-by: Roman Stratiienko <r.stratiienko@...il.com>
> ---
>  drivers/clk/sunxi-ng/ccu-sun50i-h6.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h6.c b/drivers/clk/sunxi-ng/ccu-sun50i-h6.c
> index 2ddf0a0da526f..d941238cd178a 100644
> --- a/drivers/clk/sunxi-ng/ccu-sun50i-h6.c
> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h6.c
> @@ -95,13 +95,14 @@ static struct ccu_nkmp pll_periph1_clk = {
>  	},
>  };
> 
> +/* For GPU PLL, using an output divider for DFS causes system to fail */
>  #define SUN50I_H6_PLL_GPU_REG		0x030
>  static struct ccu_nkmp pll_gpu_clk = {
>  	.enable		= BIT(31),
>  	.lock		= BIT(28),
>  	.n		= _SUNXI_CCU_MULT_MIN(8, 8, 12),
>  	.m		= _SUNXI_CCU_DIV(1, 1), /* input divider */
> -	.p		= _SUNXI_CCU_DIV(0, 1), /* output divider */
> +	.max_rate	= 756000000UL,
>  	.common		= {
>  		.reg		= 0x030,
>  		.hw.init	= CLK_HW_INIT("pll-gpu", "osc24M",
> @@ -294,12 +295,9 @@ static SUNXI_CCU_M_WITH_MUX_GATE(deinterlace_clk, "deinterlace",
>  static SUNXI_CCU_GATE(bus_deinterlace_clk, "bus-deinterlace", "psi-ahb1-ahb2",
>  		      0x62c, BIT(0), 0);
> 
> -static const char * const gpu_parents[] = { "pll-gpu" };
> -static SUNXI_CCU_M_WITH_MUX_GATE(gpu_clk, "gpu", gpu_parents, 0x670,
> -				       0, 3,	/* M */
> -				       24, 1,	/* mux */
> -				       BIT(31),	/* gate */
> -				       CLK_SET_RATE_PARENT);
> +/* GPU_CLK divider kept disabled to avoid interferences with DFS */
> +static SUNXI_CCU_GATE(gpu_clk, "gpu", "pll-gpu", 0x670,
> +		      BIT(31), CLK_SET_RATE_PARENT);
> 
>  static SUNXI_CCU_GATE(bus_gpu_clk, "bus-gpu", "psi-ahb1-ahb2",
>  		      0x67c, BIT(0), 0);



