linux-kernel - Re: [PATCH V2 2/4] drivers: qcom: rpmh-rsc: avoid locking in the interrupt handler

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <5d3769df.1c69fb81.55d03.aa33@mx.google.com>
Date:   Tue, 23 Jul 2019 13:11:10 -0700
From:   Stephen Boyd <swboyd@...omium.org>
To:     Lina Iyer <ilina@...eaurora.org>, agross@...nel.org,
        bjorn.andersson@...aro.org
Cc:     linux-arm-msm@...r.kernel.org, linux-soc@...r.kernel.org,
        rnayak@...eaurora.org, linux-kernel@...r.kernel.org,
        linux-pm@...r.kernel.org, dianders@...omium.org,
        mkshah@...eaurora.org, Lina Iyer <ilina@...eaurora.org>
Subject: Re: [PATCH V2 2/4] drivers: qcom: rpmh-rsc: avoid locking in the interrupt handler

Quoting Lina Iyer (2019-07-22 14:53:38)
> Avoid locking in the interrupt context to improve latency. Since we
> don't lock in the interrupt context, it is possible that we now could
> race with the DRV_CONTROL register that writes the enable register and
> cleared by the interrupt handler. For fire-n-forget requests, the
> interrupt may be raised as soon as the TCS is triggered and the IRQ
> handler may clear the enable bit before the DRV_CONTROL is read back.
> 
> Use the non-sync variant when enabling the TCS register to avoid reading
> back a value that may been cleared because the interrupt handler ran
> immediately after triggering the TCS.
> 
> Signed-off-by: Lina Iyer <ilina@...eaurora.org>
> ---

I have to read this patch carefully. The commit text isn't convincing me
that it is actually safe to make this change. It mostly talks about the
performance improvements and how we need to fix __tcs_trigger(), which
is good, but I was hoping to be convinced that not grabbing the lock
here is safe. 

How do we ensure that drv->tcs_in_use is cleared before we call
tcs_write() and try to look for a free bit? Isn't it possible that we'll
get into a situation where the bitmap is all used up but the hardware
has just received an interrupt and is going to clear out a bit and then
an rpmh write fails with -EBUSY?

>  drivers/soc/qcom/rpmh-rsc.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
> index 5ede8d6de3ad..694ba881624e 100644
> --- a/drivers/soc/qcom/rpmh-rsc.c
> +++ b/drivers/soc/qcom/rpmh-rsc.c
> @@ -242,9 +242,7 @@ static irqreturn_t tcs_tx_done(int irq, void *p)
>                 write_tcs_reg(drv, RSC_DRV_CMD_ENABLE, i, 0);
>                 write_tcs_reg(drv, RSC_DRV_CMD_WAIT_FOR_CMPL, i, 0);
>                 write_tcs_reg(drv, RSC_DRV_IRQ_CLEAR, 0, BIT(i));
> -               spin_lock(&drv->lock);
>                 clear_bit(i, drv->tcs_in_use);
> -               spin_unlock(&drv->lock);
>                 if (req)
>                         rpmh_tx_done(req, err);
>         }
> @@ -304,7 +302,7 @@ static void __tcs_trigger(struct rsc_drv *drv, int tcs_id)
>         enable = TCS_AMC_MODE_ENABLE;
>         write_tcs_reg_sync(drv, RSC_DRV_CONTROL, tcs_id, enable);
>         enable |= TCS_AMC_MODE_TRIGGER;
> -       write_tcs_reg_sync(drv, RSC_DRV_CONTROL, tcs_id, enable);
> +       write_tcs_reg(drv, RSC_DRV_CONTROL, tcs_id, enable);
>  }
>  
>  static int check_for_req_inflight(struct rsc_drv *drv, struct tcs_group *tcs,