[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <1102f8c5-5bb0-4d77-b1b1-81a86533f0aa@ti.com>
Date: Wed, 3 Sep 2025 15:13:49 +0530
From: Swamil Jain <s-jain1@...com>
To: Tomi Valkeinen <tomi.valkeinen@...asonboard.com>,
Maxime Ripard
<mripard@...nel.org>
CC: <h-shenoy@...com>, <devarsht@...com>, <vigneshr@...com>, <praneeth@...com>,
<u-kumar1@...com>, <dri-devel@...ts.freedesktop.org>,
<linux-kernel@...r.kernel.org>, <jyri.sarha@....fi>,
<maarten.lankhorst@...ux.intel.com>, <tzimmermann@...e.de>,
<airlied@...il.com>, <simona@...ll.ch>, <aradhya.bhatia@...ux.dev>
Subject: Re: [PATCH v5 2/3] drm/tidss: Remove max_pclk_khz from tidss display
features
Hi,
On 9/3/25 15:01, Tomi Valkeinen wrote:
> Hi,
>
> On 03/09/2025 11:38, Swamil Jain wrote:
>> Hi Tomi, Maxime,
>>
>> On 8/27/25 15:19, Tomi Valkeinen wrote:
>>> Hi,
>>>
>>> On 27/08/2025 12:27, Maxime Ripard wrote:
>>>> On Wed, Aug 27, 2025 at 11:49:22AM +0300, Tomi Valkeinen wrote:
>>>>> On 19/08/2025 22:21, Swamil Jain wrote:
>>>>>> From: Jayesh Choudhary <j-choudhary@...com>
>>>>>>
>>>>>> TIDSS hardware by itself does not have variable max_pclk for each VP.
>>>>>> The maximum pixel clock is determined by the limiting factor between
>>>>>> the functional clock and the PLL (parent to the VP/pixel clock).
>>>>>
>>>>> Hmm, this is actually not in the driver, is it? We're not limiting the
>>>>> pclk based on the fclk.
>>>>>
>>>>>> The limitation that has been modeled till now comes from the clock
>>>>>> (PLL can only be programmed to a particular max value). Instead of
>>>>>> putting it as a constant field in dispc_features, we can query the
>>>>>> DM to see if requested clock can be set or not and use it in
>>>>>> mode_valid().
>>>>>>
>>>>>> Replace constant "max_pclk_khz" in dispc_features with
>>>>>> max_successful_rate and max_attempted_rate, both of these in
>>>>>> tidss_device structure would be modified in runtime. In mode_valid()
>>>>>> call, check if a best frequency match for mode clock can be found or
>>>>>> not using "clk_round_rate()". Based on that, propagate
>>>>>> max_successful_rate and max_attempted_rate and query DM again only if
>>>>>> the requested mode clock is greater than max_attempted_rate. (As the
>>>>>> preferred display mode is usually the max resolution, driver ends up
>>>>>> checking the highest clock the first time itself which is used in
>>>>>> subsequent checks).
>>>>>>
>>>>>> Since TIDSS display controller provides clock tolerance of 5%, we use
>>>>>> this while checking the max_successful_rate. Also, move up
>>>>>> "dispc_pclk_diff()" before it is called.
>>>>>>
>>>>>> This will make the existing compatibles reusable if DSS features are
>>>>>> same across two SoCs with the only difference being the pixel clock.
>>>>>>
>>>>>> Fixes: 7246e0929945 ("drm/tidss: Add OLDI bridge support")
>>>>>> Reviewed-by: Devarsh Thakkar <devarsht@...com>
>>>>>> Signed-off-by: Jayesh Choudhary <j-choudhary@...com>
>>>>>> Signed-off-by: Swamil Jain <s-jain1@...com>
>>>>>> ---
>>>>>> drivers/gpu/drm/tidss/tidss_dispc.c | 85 ++++++++++++
>>>>>> +----------------
>>>>>> drivers/gpu/drm/tidss/tidss_dispc.h | 1 -
>>>>>> drivers/gpu/drm/tidss/tidss_drv.h | 11 +++-
>>>>>> 3 files changed, 47 insertions(+), 50 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c b/drivers/gpu/drm/
>>>>>> tidss/tidss_dispc.c
>>>>>> index c0277fa36425..c2c0fe0d4a0f 100644
>>>>>> --- a/drivers/gpu/drm/tidss/tidss_dispc.c
>>>>>> +++ b/drivers/gpu/drm/tidss/tidss_dispc.c
>>>>>> @@ -58,10 +58,6 @@ static const u16
>>>>>> tidss_k2g_common_regs[DISPC_COMMON_REG_TABLE_LEN] = {
>>>>>> const struct dispc_features dispc_k2g_feats = {
>>>>>> .min_pclk_khz = 4375,
>>>>>> - .max_pclk_khz = {
>>>>>> - [DISPC_VP_DPI] = 150000,
>>>>>> - },
>>>>>> -
>>>>>> /*
>>>>>> * XXX According TRM the RGB input buffer width up to 2560
>>>>>> should
>>>>>> * work on 3 taps, but in practice it only works up to 1280.
>>>>>> @@ -144,11 +140,6 @@ static const u16
>>>>>> tidss_am65x_common_regs[DISPC_COMMON_REG_TABLE_LEN] = {
>>>>>> };
>>>>>> const struct dispc_features dispc_am65x_feats = {
>>>>>> - .max_pclk_khz = {
>>>>>> - [DISPC_VP_DPI] = 165000,
>>>>>> - [DISPC_VP_OLDI_AM65X] = 165000,
>>>>>> - },
>>>>>> -
>>>>>> .scaling = {
>>>>>> .in_width_max_5tap_rgb = 1280,
>>>>>> .in_width_max_3tap_rgb = 2560,
>>>>>> @@ -244,11 +235,6 @@ static const u16
>>>>>> tidss_j721e_common_regs[DISPC_COMMON_REG_TABLE_LEN] = {
>>>>>> };
>>>>>> const struct dispc_features dispc_j721e_feats = {
>>>>>> - .max_pclk_khz = {
>>>>>> - [DISPC_VP_DPI] = 170000,
>>>>>> - [DISPC_VP_INTERNAL] = 600000,
>>>>>> - },
>>>>>> -
>>>>>> .scaling = {
>>>>>> .in_width_max_5tap_rgb = 2048,
>>>>>> .in_width_max_3tap_rgb = 4096,
>>>>>> @@ -315,11 +301,6 @@ const struct dispc_features dispc_j721e_feats = {
>>>>>> };
>>>>>> const struct dispc_features dispc_am625_feats = {
>>>>>> - .max_pclk_khz = {
>>>>>> - [DISPC_VP_DPI] = 165000,
>>>>>> - [DISPC_VP_INTERNAL] = 170000,
>>>>>> - },
>>>>>> -
>>>>>> .scaling = {
>>>>>> .in_width_max_5tap_rgb = 1280,
>>>>>> .in_width_max_3tap_rgb = 2560,
>>>>>> @@ -376,15 +357,6 @@ const struct dispc_features dispc_am625_feats = {
>>>>>> };
>>>>>> const struct dispc_features dispc_am62a7_feats = {
>>>>>> - /*
>>>>>> - * if the code reaches dispc_mode_valid with VP1,
>>>>>> - * it should return MODE_BAD.
>>>>>> - */
>>>>>> - .max_pclk_khz = {
>>>>>> - [DISPC_VP_TIED_OFF] = 0,
>>>>>> - [DISPC_VP_DPI] = 165000,
>>>>>> - },
>>>>>> -
>>>>>> .scaling = {
>>>>>> .in_width_max_5tap_rgb = 1280,
>>>>>> .in_width_max_3tap_rgb = 2560,
>>>>>> @@ -441,10 +413,6 @@ const struct dispc_features dispc_am62a7_feats
>>>>>> = {
>>>>>> };
>>>>>> const struct dispc_features dispc_am62l_feats = {
>>>>>> - .max_pclk_khz = {
>>>>>> - [DISPC_VP_DPI] = 165000,
>>>>>> - },
>>>>>> -
>>>>>> .subrev = DISPC_AM62L,
>>>>>> .common = "common",
>>>>>> @@ -1347,25 +1315,57 @@ static void
>>>>>> dispc_vp_set_default_color(struct dispc_device *dispc,
>>>>>> DISPC_OVR_DEFAULT_COLOR2, (v >> 32) & 0xffff);
>>>>>> }
>>>>>> +/*
>>>>>> + * Calculate the percentage difference between the requested pixel
>>>>>> clock rate
>>>>>> + * and the effective rate resulting from calculating the clock
>>>>>> divider value.
>>>>>> + */
>>>>>> +unsigned int dispc_pclk_diff(unsigned long rate, unsigned long
>>>>>> real_rate)
>>>>>> +{
>>>>>> + int r = rate / 100, rr = real_rate / 100;
>>>>>> +
>>>>>> + return (unsigned int)(abs(((rr - r) * 100) / r));
>>>>>> +}
>>>>>> +
>>>>>> +static int check_pixel_clock(struct dispc_device *dispc,
>>>>>> + u32 hw_videoport, unsigned long clock)
>>>>>> +{
>>>>>> + unsigned long round_clock;
>>>>>> +
>>>>>> + if (dispc->tidss->is_ext_vp_clk[hw_videoport])
>>>>>> + return 0;
>>>>>> +
>>>>>> + if (clock <= dispc->tidss->max_successful_rate[hw_videoport])
>>>>>> + return 0;
>>>>>> +
>>>>>> + if (clock < dispc->tidss->max_attempted_rate[hw_videoport])
>>>>>> + return -EINVAL;
>>>>>> +
>>>>>> + round_clock = clk_round_rate(dispc->vp_clk[hw_videoport], clock);
>>>>>> +
>>>>>> + if (dispc_pclk_diff(clock, round_clock) > 5)
>>>>>> + return -EINVAL;
>>>>>> +
>>>>>> + dispc->tidss->max_successful_rate[hw_videoport] = round_clock;
>>>>>> + dispc->tidss->max_attempted_rate[hw_videoport] = clock;
>>>>>
>>>>> I still don't think this logic is sound. This is trying to find the
>>>>> maximum clock rate, and optimize by avoiding the calls to
>>>>> clk_round_rate() if possible. That makes sense.
>>>>>
>>>>> But checking for the 5% tolerance breaks it, in my opinion. If we find
>>>>> out that the PLL can do, say, 100M, but we need pclk of 90M, the
>>>>> current
>>>>> maximum is still the 100M, isn't it?
>>>>
>>>> 5% is pretty large indeed. We've been using .5% in multiple drivers and
>>>> it proved to be pretty ok. I would advise you tu use it too.
>>>
>>> The 5% comes from OMAP DSS, where we had to do pixel clock with a few
>>> dividers and multipliers. The rates were quite coarse, and we ended up
>>> having quite a large tolerance.
>>>
>>> I think with tidss, we always have a PLL we control, so we should always
>>> have very exact clocks. So I'm fine with dropping it to .5%. However,
>>> this patch and series is about removing the a-bit-too-hardcoded VP clk
>>> max rate code in the driver, so I would leave everything else to another
>>> series.
>>>
>>>> It's not clear to me why avoiding a clk_round_rate() call is something
>>>> worth doing though?
>>>
>>> Hard to say if it's worth doing, someone should make some perf tests.
>>> However, afaik, the calls do go to the firmware, so it involves
>>> inter-processor calls. On OMAP DSS checking the clock rates was slow, as
>>> it involved lots of iterating with dividers and multipliers. Perhaps
>>> it's much faster here.
>>>
>>>> Even caching the maximum rate you have been able to reach before is
>>>> pretty fragile: if the PLL changes its rate, or if a sibling clock has
>>>> set some limits on what the PLL can do, your maximum isn't relevant
>>>> anymore.
>>>
>>> You're right, although afaik it should not happen with TI's SoCs. We
>>> would be in trouble anyway if that were the case (e.g. someone starts
>>> the camera, and suddenly we can't support 1080p anymore).
>>>
>>>> in other words, what's wrong with simply calling clk_round_rate() and
>>>> checking if it's within a .5% deviation?
>>>
>>> This started with discussions how to replace the hardcoded max VP clock
>>> rate (used to quickly weed out impossible rates), which in reality was
>>> actually PLL max clock rate. We don't know the PLL max rate, and can't
>>> query it, so this approach was taken.
>>>
>>>> At the very least, this should be explained in comments or the commit
>>>> message.
>>>
>>> I agree.
>>>
>>> Swamil, can you do some perf tests with clk_round_rate()? If it's fast
>>> (enough), it will simplify the driver.
>>
>> Average execution time is around 112 us.
>> Trace file including the execution time for clk_round_rate(): https://
>> gist.github.com/swamiljain/2abe86982cdeba1d69223d2d525e0cb6
>> It is better to reduce calls to clk_round_rate().
>>
>> Need your suggestions for a better approach.
>
> We can cache the clk_round_rate calls. Checking my monitor, there are 36
> modes it offers me, but only 20 different pclk rates. Also, we could
> have multiple clk_round_rate calls happening in the driver for the same
> mode, and that would also be handled.
>
> Even if clk_round_rate takes a bit long, it only happens once (I hope)
> when an app does a modeset, and multiple times when a display is
> connected, I wonder if 100 us is an issue?
>
> Just using clk_round_rate() without any tricks would simplify the driver
> nicely, so I think we should try to see if we can get that working.
>
Are you suggesting to just check each time, a clock can be set using
clk_round_rate() and not cache max_pclk? But then we have to make sure
round_rate should be within some range(5% for us).
> Do you know if there's anything to improve on the clock side, ti-sci or
> firmare?
Not sure.
Regards,
Swamil
>
> Tomi
>
Powered by blists - more mailing lists