lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a1abf31a-7a4a-4f8d-bf48-6b826aa01197@nxp.com>
Date: Wed, 7 May 2025 10:10:53 +0800
From: Liu Ying <victor.liu@....com>
To: Luca Ceresoli <luca.ceresoli@...tlin.com>
Cc: Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
 Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
 David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
 Andrzej Hajda <andrzej.hajda@...el.com>,
 Neil Armstrong <neil.armstrong@...aro.org>, Robert Foss <rfoss@...nel.org>,
 Laurent Pinchart <Laurent.pinchart@...asonboard.com>,
 Jonas Karlman <jonas@...boo.se>, Jernej Skrabec <jernej.skrabec@...il.com>,
 Jagan Teki <jagan@...rulasolutions.com>, Shawn Guo <shawnguo@...nel.org>,
 Sascha Hauer <s.hauer@...gutronix.de>,
 Pengutronix Kernel Team <kernel@...gutronix.de>,
 Fabio Estevam <festevam@...il.com>, Douglas Anderson
 <dianders@...omium.org>, Chun-Kuang Hu <chunkuang.hu@...nel.org>,
 Krzysztof Kozlowski <krzk@...nel.org>, Anusha Srivatsa
 <asrivats@...hat.com>, Paul Kocialkowski <paulk@...-base.io>,
 Dmitry Baryshkov <lumag@...nel.org>, Hui Pu <Hui.Pu@...ealthcare.com>,
 Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
 dri-devel@...ts.freedesktop.org, asahi@...ts.linux.dev,
 linux-kernel@...r.kernel.org, chrome-platform@...ts.linux.dev,
 imx@...ts.linux.dev, linux-arm-kernel@...ts.infradead.org,
 linux-mediatek@...ts.infradead.org, linux-amlogic@...ts.infradead.org,
 linux-renesas-soc@...r.kernel.org, platform-driver-x86@...r.kernel.org,
 linux-samsung-soc@...r.kernel.org, linux-arm-msm@...r.kernel.org,
 freedreno@...ts.freedesktop.org, linux-stm32@...md-mailman.stormreply.com
Subject: Re: [PATCH v2 30/34] drm/bridge: imx8qxp-pixel-combiner: convert to
 devm_drm_bridge_alloc() API

On 05/07/2025, Luca Ceresoli wrote:
> Hello Liu,

Hi Luca,

> 
> thanks for your further feedback.
> 
> On Tue, 6 May 2025 10:24:18 +0800
> Liu Ying <victor.liu@....com> wrote:
> 
>> On 04/30/2025, Luca Ceresoli wrote:
>>> Hello Liu,  
>>
>> Hi Luca,
>>
>>>
>>> On Tue, 29 Apr 2025 10:10:55 +0800
>>> Liu Ying <victor.liu@....com> wrote:
>>>   
>>>> Hi,
>>>>
>>>> On 04/25/2025, Luca Ceresoli wrote:  
>>>>> This is the new API for allocating DRM bridges.
>>>>>
>>>>> This driver embeds an array of channels in the main struct, and each
>>>>> channel embeds a drm_bridge. This prevents dynamic, refcount-based
>>>>> deallocation of the bridges.
>>>>>
>>>>> To make the new, dynamic bridge allocation possible:
>>>>>
>>>>>  * change the array of channels into an array of channel pointers
>>>>>  * allocate each channel using devm_drm_bridge_alloc()
>>>>>  * adapt the code wherever using the channels
>>>>>
>>>>> Signed-off-by: Luca Ceresoli <luca.ceresoli@...tlin.com>  
>>>
>>> [...]
>>>   
>>>>> @@ -345,8 +351,8 @@ static int imx8qxp_pc_bridge_probe(struct platform_device *pdev)
>>>>>  free_child:
>>>>>  	of_node_put(child);
>>>>>  
>>>>> -	if (i == 1 && pc->ch[0].next_bridge)
>>>>> -		drm_bridge_remove(&pc->ch[0].bridge);
>>>>> +	if (i == 1 && pc->ch[0]->next_bridge)    
>>>>
>>>> Since this patch makes pc->ch[0] and pc->ch[1] be allocated separately,
>>>> pc->ch[0] could be NULL if channel0 is not available, hence a NULL pointer
>>>> dereference here...  
>>>
>>> See below for this.
>>>   
>>>>> +		drm_bridge_remove(&pc->ch[0]->bridge);
>>>>>  
>>>>>  	pm_runtime_disable(dev);
>>>>>  	return ret;
>>>>> @@ -359,7 +365,7 @@ static void imx8qxp_pc_bridge_remove(struct platform_device *pdev)
>>>>>  	int i;
>>>>>  
>>>>>  	for (i = 0; i < 2; i++) {
>>>>> -		ch = &pc->ch[i];
>>>>> +		ch = pc->ch[i];
>>>>>  
>>>>>  		if (!ch->is_available)    
>>>>
>>>> ...and here too.  
>>>
>>> This is indeed a bug, I should have checked the pointer for being
>>> non-NULL.
>>>
>>> Looking at that more closely, I think the is_available flag can be
>>> entirely removed now. The allocation itself (ch != NULL) now is
>>> equivalent. Do you think my reasoning is correct?
>>>
>>> Ouch! After writing the previous paragraph I realized you proposed this
>>> a few lines below! OK, removing is_available. :)
>>>
>>> [...]
>>>   
>>>> On top of this patch series, this issue doesn't happen if I apply the below
>>>> change:  
>>>
>>> [...]
>>>   
>>>> @@ -351,7 +349,7 @@ static int imx8qxp_pc_bridge_probe(struct platform_device *pdev)
>>>>  free_child:
>>>>         of_node_put(child);
>>>>  
>>>> -       if (i == 1 && pc->ch[0]->next_bridge)
>>>> +       if (i == 1 && pc->ch[0])
>>>>                 drm_bridge_remove(&pc->ch[0]->bridge);  
>>>
>>> Unrelated to this patch, but as I looked at it more in depth now, I'm
>>> not sure this whole logic is robust, even in the original code.
>>>
>>> The 'i == 1' check here seems to mean "if some error happened when
>>> handling channel@1, that means channel@0 was successfully initialized,
>>> so let's clean up channel 0".
>>>
>>> However my understanding of the bindings is that device tree is allowed
>>> to have the channel@1 node before the channel@0 node (or even channel@1
>>> without channel@0, but that's less problematic here).
>>>
>>> In such case (channel@1 before channel@0), this would happen:
>>>
>>>  1. alloc and init ch[1], all OK
>>>  2. alloc and init ch[0], an error happens
>>>     (e.g. of_graph_get_remote_node() fails)
>>>
>>> So we'd reach the free_child: label, and we should call
>>> drm_bridge_remove() for ch[1]->bridge, but there's no code to do that.
>>>
>>> To be robust in such a case, I think both channels need to be checked
>>> independently, as the status of one does not imply the status of the
>>> other. E.g.:
>>>
>>>   for (i = 0; i < 2; i++)
>>>       if (pc->ch[i] && pc->ch[i]->next_bridge)
>>>           drm_bridge_remove(&pc->ch[i]->bridge);
>>>
>>> (which is similar to what .remove() does after the changes discussed in
>>> this thread, and which I have queued for v3)
>>>
>>> What's your opinion? Do you think I missed anything?  
>>
>> The pixel combiner DT node would be added in imx8-ss-dc{0,1}.dtsi, please
>> see the case for imx8-ss-dc0.dtsi introduced by an in-flight patch[1].  As
>> channel@{0,1} child nodes always exist(DT overlay cannot effectively delete
>> any of them) and channel@0 always comes first, there is no problematic case.
> 
> I'm not questioning what existing and future dts files (will) contain,
> and surely I don't see a good reason someone would write channel@1
> before channel@0.
> 
> My point is:
> 
>  - the bindings _allow_ channel1 before channel@0
>  - the error management code after the free_child label won't work
>    correctly if channel1 is before channel@0 in the device tree
> 
> IOW the driver is not robust against all legal device tree descriptions,
> and it could be easily made robust using the example code in my
> previous e-mail (quoted a few lines above).
> 
> If you agree about this I'll be happy to send a patch doing that change.
> If you think I'm wrong, I won't fight a battle. This topic is
> orthogonal to the change I'm introducing in this patch, and I can
> continue the conversion independently from this discussion.

I don't think it is necessary to do that change for now.  When someone
really comes across this issue, we may make the error management code
robust.

> 
>>> Thanks for taking the time to dig into this!  
>>
>> After looking into this patch and patch 31(though I've already provided my A-b)
>> more closely, I think the imx8qxp_pc and imx8{qm,qxp}_ldb main structures
>> should have the same life time with the embedded DRM bridges, because for
>> example the clk_apb clock in struct imx8qxp_pc would be accessed by the
>> imx8qxp_pc_bridge_mode_set DRM bridge callback.  But, IIUC, your patches extend
>> the life time for the embedded channel/bridge structures only, but not for the
>> main structures.  What do you think ?
> 
> I see you concern, but I'm sure the change I'm introducing is not
> creating the problem you are concerned about.
> 
> The key aspect is that my patch is merely changing the lifetime of the
> _allocation_ of the drm_bridge, not its usage. On drm_bridge_remove()
> the bridge is removed from its encoder chain and it is completely not
> reachable, both before and after my patch. With my patch it is not
> freed immediately, but it's just a piece of "wasted" memory that is
> still allocated until elsewhere in the kernel there are pointers to it,
> to avoid use-after-free.
> 
> With this explanation, do you think my patch is correct (after fixing
> the bug we already discussed of course)?

I tend to say your patch is not correct because we'll eventually make sure
that removing a bridge module is safe when doing atomic commit, which means
the main structures should have the same life time with the DRM bridges.

> 
> Best regards,
> Luca
> 

-- 
Regards,
Liu Ying

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ