Message-ID: <eda306fc-1a92-4a2d-b13f-c3b59a39ef8d@quicinc.com>
Date:   Mon, 31 Jul 2023 20:49:09 +0530
From:   Pavan Kondeti <quic_pkondeti@...cinc.com>
To:     Praveenkumar I <quic_ipkumar@...cinc.com>
CC:     <agross@...nel.org>, <andersson@...nel.org>,
        <konrad.dybcio@...aro.org>, <quic_clew@...cinc.com>,
        <linux-arm-msm@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <quic_varada@...cinc.com>, <quic_srichara@...cinc.com>
Subject: Re: [PATCH] soc: qcom: qmi: Signal the txn completion after
 releasing the mutex

On Mon, Jul 31, 2023 at 06:37:55PM +0530, Praveenkumar I wrote:
> txn is on worker #1's stack
> 
> Worker #1                                       Worker #2
> *********                                       *********
> 
> qmi_txn_wait(txn)                               qmi_handle_message
>    |                                                  |
>    |                                                  |
>  wait_for_completion_timeout(&txn->completion)       ....
>    |                                             mutex_lock(&txn->lock)
>    |                                                  |
>  mutex_lock(&txn->lock)                               |
>    .....                                         complete(&txn->completion)
>    |                                             mutex_unlock(&txn->lock)
>    |
>  mutex_unlock(&txn->lock)
> 
> In the case above, while #2 is inside mutex_unlock(&txn->lock), it gets
> scheduled out between releasing the lock and performing the lock-related
> wakeup. As a result #1 acquires the lock, unlocks it, and then frees
> txn (which is where the lock itself resides).
> 
> Now #2 gets scheduled again and tries to do the rest of the lock-related
> wakeup, but the lock itself is invalid because txn is gone.
> 
> Fix this by doing the mutex_unlock(&txn->lock) first and then
> complete(&txn->completion) in #2.
> 
> Fixes: 3830d0771ef6 ("soc: qcom: Introduce QMI helpers")
> Cc: stable@...r.kernel.org
> Signed-off-by: Sricharan Ramabadhran <quic_srichara@...cinc.com>
> Signed-off-by: Praveenkumar I <quic_ipkumar@...cinc.com>
> ---
>  drivers/soc/qcom/qmi_interface.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/soc/qcom/qmi_interface.c b/drivers/soc/qcom/qmi_interface.c
> index 78d7361fdcf2..92e29db97359 100644
> --- a/drivers/soc/qcom/qmi_interface.c
> +++ b/drivers/soc/qcom/qmi_interface.c
> @@ -505,12 +505,13 @@ static void qmi_handle_message(struct qmi_handle *qmi,
>  				pr_err("failed to decode incoming message\n");
>  
>  			txn->result = ret;
> -			complete(&txn->completion);
>  		} else  {
>  			qmi_invoke_handler(qmi, sq, txn, buf, len);
>  		}
>  
>  		mutex_unlock(&txn->lock);
> +		if (txn->dest && txn->ei)
> +			complete(&txn->completion);
>  	} else {
>  		/* Create a txn based on the txn_id of the incoming message */
>  		memset(&tmp_txn, 0, sizeof(tmp_txn));

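If it helps to see the lifetime rule in isolation, the race described
above reduces to roughly the following userspace analogue (a standalone
pthread sketch with made-up names, not the driver code; sem_t stands in
for struct completion here):

#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>

/* Stand-in for struct qmi_txn; only the fields that matter here. */
struct fake_txn {
	pthread_mutex_t lock;
	sem_t completion;	/* plays the role of struct completion */
	int result;
};

/* Worker #2, modelled on qmi_handle_message() before this patch:
 * the completion is signalled while txn->lock is still held. */
static void *handler(void *arg)
{
	struct fake_txn *txn = arg;

	pthread_mutex_lock(&txn->lock);
	txn->result = 0;
	sem_post(&txn->completion);	/* complete(&txn->completion) */
	pthread_mutex_unlock(&txn->lock);	/* the unlock path may
						 * still reference the lock
						 * after ownership is
						 * released */
	return NULL;
}

/* Worker #1, modelled on qmi_txn_wait(): once woken it takes and drops
 * the lock, then frees the txn that the lock itself lives in. */
static void *waiter(void *arg)
{
	struct fake_txn *txn = arg;

	sem_wait(&txn->completion);	/* wait_for_completion_timeout() */
	pthread_mutex_lock(&txn->lock);
	pthread_mutex_unlock(&txn->lock);
	free(txn);			/* the use-after-free target */
	return NULL;
}

int main(void)
{
	struct fake_txn *txn = malloc(sizeof(*txn));
	pthread_t t1, t2;

	pthread_mutex_init(&txn->lock, NULL);
	sem_init(&txn->completion, 0, 0);
	pthread_create(&t1, NULL, waiter, txn);
	pthread_create(&t2, NULL, handler, txn);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return 0;
}

The invariant is that nobody may touch txn (its lock included) once the
waiter has been allowed to win the lock and free it.
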
What happens in the remote scenario where the waiter times out at the
very same time you are releasing the mutex, but before you call
complete()? The caller might end up freeing the txn structure, which
results in the same issue you are currently fixing.
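
Concretely, with this patch applied the window looks like this
(continuing the analogue above; sem_timedwait() stands in for
wait_for_completion_timeout(), and <time.h> is additionally needed for
clock_gettime()):

/* Worker #2 with the patch applied: completion after the unlock. */
static void *handler_patched(void *arg)
{
	struct fake_txn *txn = arg;

	pthread_mutex_lock(&txn->lock);
	txn->result = 0;
	pthread_mutex_unlock(&txn->lock);
	/*
	 * Window: if the waiter's timeout fires here, it can take the
	 * lock, drop it and free the txn before we reach ...
	 */
	sem_post(&txn->completion);	/* ... this use-after-free. */
	return NULL;
}

/* Worker #1 when the wait times out instead of completing. */
static void *waiter_timed_out(void *arg)
{
	struct fake_txn *txn = arg;
	struct timespec ts;

	clock_gettime(CLOCK_REALTIME, &ts);
	ts.tv_sec += 1;
	if (sem_timedwait(&txn->completion, &ts) != 0) {	/* timed out */
		pthread_mutex_lock(&txn->lock);
		pthread_mutex_unlock(&txn->lock);
		free(txn);	/* handler may still be about to signal
				 * txn->completion */
	}
	return NULL;
}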

Thanks,
Pavan
