lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <215b8c33-d803-4a4c-a7c4-f6be231e15a0@ideasonboard.com>
Date: Mon, 5 Jan 2026 13:50:28 +0200
From: Tomi Valkeinen <tomi.valkeinen+renesas@...asonboard.com>
To: David Laight <david.laight.linux@...il.com>,
 Marek Vasut <marek.vasut@...lbox.org>
Cc: dri-devel@...ts.freedesktop.org, kernel test robot <lkp@...el.com>,
 David Airlie <airlied@...il.com>,
 Geert Uytterhoeven <geert+renesas@...der.be>,
 Kieran Bingham <kieran.bingham+renesas@...asonboard.com>,
 Laurent Pinchart <laurent.pinchart+renesas@...asonboard.com>,
 Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
 Magnus Damm <magnus.damm@...il.com>, Maxime Ripard <mripard@...nel.org>,
 Simona Vetter <simona@...ll.ch>, Thomas Zimmermann <tzimmermann@...e.de>,
 linux-kernel@...r.kernel.org, linux-renesas-soc@...r.kernel.org
Subject: Re: [PATCH] drm/rcar-du: dsi: Clean up VCLK divider calculation

Hi,

On 02/01/2026 01:44, David Laight wrote:
> On Fri, 2 Jan 2026 00:25:54 +0100
> Marek Vasut <marek.vasut@...lbox.org> wrote:
> 
>> On 1/2/26 12:13 AM, David Laight wrote:
>>> On Thu, 1 Jan 2026 22:26:34 +0100
>>> Marek Vasut <marek.vasut@...lbox.org> wrote:
>>>   
>>>> On 1/1/26 12:42 PM, David Laight wrote:
>>>>
>>>> Hello David,
>>>>  
>>>>>>    static void rcar_mipi_dsi_pll_calc(struct rcar_mipi_dsi *dsi,
>>>>>>    				   unsigned long fin_rate,
>>>>>>    				   unsigned long fout_target,
>>>>>> -				   struct dsi_setup_info *setup_info)
>>>>>> +				   struct dsi_setup_info *setup_info,
>>>>>> +				   u16 vclk_divider)  
>>>>>
>>>>> There is no need for this to be u16, unsigned int will generate better code.  
>>>>
>>>> Can you please elaborate on the "better code" part ?  
>>>
>>> Basically the compiler has to generate extra code to ensure the value
>>> doesn't exceed 65535.
>>> So there will be a mask of the function parameter (passes in a 32bit register).
>>> Any arithmetic needs similar masking of the result.
>>> You won't see the latter (as much) on x86 (or m68k) because the compiler
>>> can use the C 'as if' rule and use the cpu's 8/16 bit registers and alu.
>>> But pretty much no other cpu can do that, so extra instructions are needed
>>> to stop the value (in a 32bit register) exceeding the limit for the variable.
>>>
>>> Remember that while C variables can be char/short the values they contain
>>> are promoted to 'signed int' before (pretty much) anything is done with
>>> the value, any calculated value is then truncated before being assigned back.
>>> For on-stack values this is fine - but modern compilers was to keep values
>>> in registers as much as possible.
>>>   
>>>>  
>>>>>>    {
>>>>>>    	unsigned int best_err = -1;
>>>>>>    	const struct rcar_mipi_dsi_device_info *info = dsi->info;
>>>>>> @@ -360,7 +373,7 @@ static void rcar_mipi_dsi_pll_calc(struct rcar_mipi_dsi *dsi,
>>>>>>    			if (fout < info->fout_min || fout > info->fout_max)
>>>>>>    				continue;
>>>>>>    
>>>>>> -			fout = div64_u64(fout, setup_info->vclk_divider);
>>>>>> +			fout = div64_u64(fout, vclk_divider);  
>>>>>
>>>>> Since vclk_divider is BIT_U32(div [+ 1]) this is just a shift right.
>>>>> So pass the bit number instead.  
>>>>
>>>> While I agree this is a shift in the end, it also makes the code harder
>>>> to understand. I can add the following change into V2, but I would like
>>>> input from Tomi/Laurent on whether we should really push the
>>>> optimization in that direction:  
>>>
>>> The shift really is a lot faster.
>>> Even on 64bit x86 cpus it can be slow - 35-88 clocks prior to 'Cannon Lake'.
>>> AMD cpu get better with Zen3, and using a 64bit divide with 32bits values
>>> doesn't slow things down (it does on the Intel cpu).
>>> Don't even think about what happens on 32bit cpus.
>>> I don't know about other architectures.
>>> Just comment as 'x >> n' rather than 'x / (1 << n)'
>>
>> Please note that this code is built primarily for arm64 , so discussing 
>> x86 or m68k optimizations doesn't make much sense here.
> 
> ARM definitely only has 32 and 64bit maths - so you will see masking
> instructions for arithmetic on char/short values (unless the compiler
> can tell that the values cannot get too large.)
> 
> Divide performance is cpu dependant - the only arm cpu I've used only
> had software divide (but a fast barrel shifter).
> If you think that Intel haven't thrown much silicon at integer divide
> it isn't that likely that arm will have a divide that is much faster
> than 1 clock for each bit in the quotient.
> (Of course I might be wrong...)
The division is done once when enabling the display? If so, I'd
prioritize readability. That said, division done with shift if quite
common, so I'm not sure if it would be that bad with a few comments.

 Tomi


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ