lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ad306275-cd38-e6ad-55cc-0f7c4bdfcecf@collabora.com>
Date:   Fri, 18 Feb 2022 10:16:51 +0100
From:   AngeloGioacchino Del Regno 
        <angelogioacchino.delregno@...labora.com>
To:     Mathieu Poirier <mathieu.poirier@...aro.org>
Cc:     bjorn.andersson@...aro.org, matthias.bgg@...il.com,
        pihsun@...omium.org, linux-remoteproc@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org,
        linux-mediatek@...ts.infradead.org, linux-kernel@...r.kernel.org,
        kernel@...labora.com
Subject: Re: [PATCH] rpmsg: mtk_rpmsg: Fix circular locking dependency

Il 17/02/22 20:03, Mathieu Poirier ha scritto:
> Hi Angelo,
> 
> On Fri, Jan 14, 2022 at 03:47:37PM +0100, AngeloGioacchino Del Regno wrote:
>> During execution of the worker that's used to register rpmsg devices
>> we are safely locking the channels mutex but, when creating a new
>> endpoint for such devices, we are registering a IPI on the SCP, which
>> then makes the SCP to trigger an interrupt, lock its own mutex and in
>> turn register more subdevices.
>> This creates a circular locking dependency situation, as the mtk_rpmsg
>> channels_lock will then depend on the SCP IPI lock.
>>
>> [   18.014514]  Possible unsafe locking scenario:
>> [   18.014515]        CPU0                    CPU1
>> [   18.014517]        ----                    ----
>> [   18.045467]   lock(&mtk_subdev->channels_lock);
>> [   18.045474]                                lock(&scp->ipi_desc[i].lock);
> 
> I spent well over an hour tracing through the meanders of the code to end up in
> scp_ipi_register() which, I think, leads to the above.  But from there I don't
> see how an IPI can come in and that tells me my assumption is wrong.
> 
> Can you give more details on the events that lead to the above?  I'm not saying
> there is no problem, I just need to understand it.
> 

Hi Mathieu,

I understand that following this flow without the assistance of the actual
hardware may be a little confusing, so, no worries.

drivers/remoteproc/mtk_scp.c - this driver manages the SCP (obviously, a
remote processor)
drivers/remoteproc/mtk_scp_ipi.c - public functions for kernel SCP IPC

Flow:
- MediaTek SCP gets probed
- RPMSG starts, we start probing "something", like google,cros-ec-rpmsg
- mtk_rpmsg: creates endpoint; IPI handler is registered here.

          ( more flow )

- mtk_rpmsg: mtk_rpmsg_ns_cb() -> mtk_rpmsg_create_device(), channel is
              added to the channels list, worker gets scheduled


Now for the part that produces the real issue:

label_a:

*** RPMSG MUTEX LOCK ***
- mtk_rpmsg: ## Go through multiple channels ##, call mtk_rpmsg_register_device()

- Registered device tries to communicate through RPMSG
- .send() or .trysend() (depending on the device) is called: send_ipi()
     *** SCP MUTEX LOCK ***
    - mtk_scp_ipi: Data written, ACK? ok -> return 0
     *** SCP MUTEX UNLOCK ***

- mtk_scp_ipi: **** INTERRUPT!!! **** New RPMSG NS available? -> create channel
           goto label_a;

*** RPMSG MUTEX UNLOCK ***


Pardon me for keeping some things in this flow implicit, but that was done to
simplify it as much as possible as to try to make you understand the situation.

Cheers,
Angelo

> Thanks,
> Mathieu
> 
>> [   18.228399]                                lock(&mtk_subdev->channels_lock);
>> [   18.228405]   lock(&scp->ipi_desc[i].lock);
>> [   18.264405]
>>
>> To solve this, simply unlock the channels_lock mutex before calling
>> mtk_rpmsg_register_device() and relock it right after, as safety is
>> still ensured by the locking mechanism that happens right after
>> through SCP.
>> Notably, mtk_rpmsg_register_device() does not even require locking.
>>
>> Fixes: 7017996951fd ("rpmsg: add rpmsg support for mt8183 SCP.")
>> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>
>> ---
>>   drivers/rpmsg/mtk_rpmsg.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/rpmsg/mtk_rpmsg.c b/drivers/rpmsg/mtk_rpmsg.c
>> index 5b4404b8be4c..d1213c33da20 100644
>> --- a/drivers/rpmsg/mtk_rpmsg.c
>> +++ b/drivers/rpmsg/mtk_rpmsg.c
>> @@ -234,7 +234,9 @@ static void mtk_register_device_work_function(struct work_struct *register_work)
>>   		if (info->registered)
>>   			continue;
>>   
>> +		mutex_unlock(&subdev->channels_lock);
>>   		ret = mtk_rpmsg_register_device(subdev, &info->info);
>> +		mutex_lock(&subdev->channels_lock);
>>   		if (ret) {
>>   			dev_err(&pdev->dev, "Can't create rpmsg_device\n");
>>   			continue;
>> -- 
>> 2.33.1
>>



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ