linux-kernel - Re: [PATCH] coresight: Defer probe when the child dev is not probed


Message-ID: <2fd9c96a-fc86-7247-d13a-a5283bb82494@arm.com>
Date:   Tue, 1 Mar 2022 15:03:33 +0000
From:   Suzuki K Poulose <suzuki.poulose@....com>
To:     Jinlong Mao <quic_jinlmao@...cinc.com>,
        Mike Leach <mike.leach@...aro.org>
Cc:     Mathieu Poirier <mathieu.poirier@...aro.org>,
        Leo Yan <leo.yan@...aro.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Mao Jinlong <jinlmao@....qualcomm.com>,
        coresight@...ts.linaro.org, linux-arm-kernel@...ts.infradead.org,
        linux-kernel@...r.kernel.org, linux-arm-msm@...r.kernel.org,
        Tingwei Zhang <quic_tingweiz@...cinc.com>,
        Yuanfang Zhang <quic_yuanfang@...cinc.com>,
        Tao Zhang <quic_taozha@...cinc.com>,
        Hao Zhang <quic_hazha@...cinc.com>
Subject: Re: [PATCH] coresight: Defer probe when the child dev is not probed

Hi

On 01/03/2022 13:30, Jinlong Mao wrote:
> Hi Mike,
> 
> On 3/1/2022 9:15 PM, Mike Leach wrote:
>> Hi,
>>
>> On Tue, 1 Mar 2022 at 11:42, Jinlong Mao <quic_jinlmao@...cinc.com> wrote:
>>> On 2/28/2022 10:51 PM, Suzuki K Poulose wrote:
>>>

...

>>>
>>> Hi Suzuki,
>>>
>>> This issue is caused by a race condition: the probes of a device and
>>> its child device run at the same time.
>>>
>>> For example, take device0 and its child device, device1. Both are
>>> calling coresight_register(). device0 is in
>>> coresight_fixup_device_conns() while device1 is waiting for device0
>>> to release coresight_mutex. Because device1's csdev node is already
>>> allocated, coresight_make_links() is called for device0. Then, in
>>> coresight_add_sysfs_link(), has_conns_grp is true for device0 but
>>> false for device1, because has_conns_grp is only set to true in
>>> coresight_create_conns_sysfs_group(). The probe of device0 fails
>>> in this condition.
>>>
>>>
>>> struct coresight_device *coresight_register(struct coresight_desc *desc)
>>> {
>>>     .........
>>>      mutex_lock(&coresight_mutex);
>>>
>>>      ret = coresight_create_conns_sysfs_group(csdev);
>>>      if (!ret)
>>>          ret = coresight_fixup_device_conns(csdev);
>>>      if (!ret)
>>>          ret = coresight_fixup_orphan_conns(csdev);
>>>      if (!ret && cti_assoc_ops && cti_assoc_ops->add)
>>>          cti_assoc_ops->add(csdev);
>>>
>>>      mutex_unlock(&coresight_mutex);
>>>
>>> .........
>>>
>>> }
>>>
>>> static int coresight_fixup_device_conns(struct coresight_device *csdev)
>>> {
>>>     ..........
>>>          conn->child_dev =
>>>              coresight_find_csdev_by_fwnode(conn->child_fwnode);
>> The issue appears to be a constraint hidden in the lower layers of the 
>> code.
>> Would a better solution not be to alter the code here:
>>
>> if (conn->child_dev && conn->child_dev->has_conns_grp) {
>>     ...
>> } else {
>>     csdev->orphan = true;
>> }
>>
>> which would mean that the connection attempt would drop through to
>> label the connection as an orphan, to be cleaned up by the child
>> itself when it runs coresight_fixup_orphan_conns()
>>
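
To illustrate the fallback described above, here is a minimal userspace
sketch in C. It is not the kernel code: struct cs_device,
struct cs_connection, fixup_device_conns(), fixup_orphan_conns() and
demo_orphan_fallback() are hypothetical stand-ins, and fwnode matching
and sysfs link creation are reduced to plain flags. It only shows the
idea that a child without has_conns_grp leaves the parent marked as an
orphan instead of failing the parent's probe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct cs_device;

struct cs_connection {
	struct cs_device *child_dev;	/* child endpoint, if already found */
	bool linked;			/* models the sysfs link being created */
};

struct cs_device {
	bool has_conns_grp;	/* connections sysfs group exists */
	bool orphan;		/* some connection is still unresolved */
	struct cs_connection *conns;
	size_t nr_conns;
};

/* Models the parent's fixup pass with the extra has_conns_grp check. */
static int fixup_device_conns(struct cs_device *csdev)
{
	for (size_t i = 0; i < csdev->nr_conns; i++) {
		struct cs_connection *conn = &csdev->conns[i];

		if (conn->child_dev && conn->child_dev->has_conns_grp)
			conn->linked = true;
		else
			csdev->orphan = true;	/* defer instead of failing */
	}
	return 0;
}

/* Models the pass the child runs later to adopt orphaned connections. */
static void fixup_orphan_conns(struct cs_device *parent,
			       struct cs_device *child)
{
	bool still_orphan = false;

	for (size_t i = 0; i < parent->nr_conns; i++) {
		struct cs_connection *conn = &parent->conns[i];

		if (conn->linked)
			continue;
		if (conn->child_dev == child && child->has_conns_grp)
			conn->linked = true;
		else
			still_orphan = true;
	}
	parent->orphan = still_orphan;
}

/* Replays the race: parent runs its fixup before the child is ready. */
static int demo_orphan_fallback(void)
{
	struct cs_device child = { .has_conns_grp = false };
	struct cs_connection conn = { .child_dev = &child };
	struct cs_device parent = {
		.has_conns_grp = true, .conns = &conn, .nr_conns = 1,
	};
	int ret = fixup_device_conns(&parent);

	assert(ret == 0);			/* probe no longer fails */
	assert(parent.orphan && !conn.linked);

	child.has_conns_grp = true;		/* child creates its group */
	fixup_orphan_conns(&parent, &child);
	assert(!parent.orphan && conn.linked);
	return 0;
}
```

Calling demo_orphan_fallback() walks through the sequence from the
race description: the parent's fixup succeeds with the connection left
unlinked, and the link appears once the child finishes its own setup.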

Thanks Mike, I think that is a good solution. Alternatively, we
could make sure that device_register() and the fixups that follow
it are atomic.

i.e.

	mutex_lock()

	device_register()
	fixup_connections()
	create_sysfs()

	mutex_unlock();

This fix may be a bit more invasive than Mike's proposal, but it makes
sure we don't end up with a half-baked device on the coresight-bus.
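
As a rough illustration of the atomic variant, here is a minimal
userspace sketch in C that uses a pthread mutex in place of
coresight_mutex. Everything in it is hypothetical (struct cs_bus,
struct cs_device, cs_register(), find_by_id(), demo_atomic_register()),
integer ids stand in for fwnodes, and the order of the steps inside the
lock is an assumption. The point is only that a device becomes visible
on the bus and gets its connections group within one critical section,
so no other registration can observe it half set up.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEVS 8

/* Hypothetical model of a device; ids stand in for fwnodes. */
struct cs_device {
	int id;
	int child_id;		/* id of the connected child, -1 for none */
	bool has_conns_grp;
	bool linked;
	bool orphan;
};

/* Hypothetical model of the bus and the lock protecting it. */
struct cs_bus {
	pthread_mutex_t lock;
	struct cs_device *devs[MAX_DEVS];
	size_t nr_devs;
};

/* Caller must hold bus->lock. */
static struct cs_device *find_by_id(struct cs_bus *bus, int id)
{
	for (size_t i = 0; i < bus->nr_devs; i++)
		if (bus->devs[i]->id == id)
			return bus->devs[i];
	return NULL;
}

static int cs_register(struct cs_bus *bus, struct cs_device *csdev)
{
	int ret = 0;

	pthread_mutex_lock(&bus->lock);
	if (bus->nr_devs == MAX_DEVS) {
		ret = -1;
		goto out;
	}

	bus->devs[bus->nr_devs++] = csdev;	/* models device_register() */
	csdev->has_conns_grp = true;		/* models create_sysfs() */

	/* Fix up this device's own connection. */
	if (csdev->child_id >= 0) {
		if (find_by_id(bus, csdev->child_id))
			csdev->linked = true;
		else
			csdev->orphan = true;
	}

	/* Fix up orphans that were waiting for this device. */
	for (size_t i = 0; i < bus->nr_devs; i++) {
		struct cs_device *p = bus->devs[i];

		if (p->orphan && p->child_id == csdev->id) {
			p->linked = true;
			p->orphan = false;
		}
	}
out:
	pthread_mutex_unlock(&bus->lock);
	return ret;
}

/* Registers a parent before its child and checks the end state. */
static int demo_atomic_register(void)
{
	struct cs_bus bus = { .nr_devs = 0 };
	struct cs_device parent = { .id = 0, .child_id = 1 };
	struct cs_device child = { .id = 1, .child_id = -1 };
	int ret;

	pthread_mutex_init(&bus.lock, NULL);

	ret = cs_register(&bus, &parent);
	assert(ret == 0);
	assert(parent.orphan && !parent.linked);

	ret = cs_register(&bus, &child);
	assert(ret == 0);
	assert(!parent.orphan && parent.linked);
	assert(child.has_conns_grp);

	pthread_mutex_destroy(&bus.lock);
	return 0;
}
```

Because find_by_id() is only called with the lock held, any device it
returns has already completed the whole sequence, which is the property
the i.e. listing above is after.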

Suzuki