linux-kernel - Re: [PATCH V2 2/4] drivers: qcom: rpmh-rsc: avoid locking in the interrupt handler

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190724145251.GB18620@codeaurora.org>
Date:   Wed, 24 Jul 2019 08:52:51 -0600
From:   Lina Iyer <ilina@...eaurora.org>
To:     Stephen Boyd <swboyd@...omium.org>
Cc:     agross@...nel.org, bjorn.andersson@...aro.org,
        linux-arm-msm@...r.kernel.org, linux-soc@...r.kernel.org,
        rnayak@...eaurora.org, linux-kernel@...r.kernel.org,
        linux-pm@...r.kernel.org, dianders@...omium.org,
        mkshah@...eaurora.org
Subject: Re: [PATCH V2 2/4] drivers: qcom: rpmh-rsc: avoid locking in the
 interrupt handler

On Tue, Jul 23 2019 at 14:11 -0600, Stephen Boyd wrote:
>Quoting Lina Iyer (2019-07-22 14:53:38)
>> Avoid locking in the interrupt context to improve latency. Since we
>> don't lock in the interrupt context, it is possible that we now could
>> race with the DRV_CONTROL register that writes the enable register and
>> cleared by the interrupt handler. For fire-n-forget requests, the
>> interrupt may be raised as soon as the TCS is triggered and the IRQ
>> handler may clear the enable bit before the DRV_CONTROL is read back.
>>
>> Use the non-sync variant when enabling the TCS register to avoid reading
>> back a value that may been cleared because the interrupt handler ran
>> immediately after triggering the TCS.
>>
>> Signed-off-by: Lina Iyer <ilina@...eaurora.org>
>> ---
>
>I have to read this patch carefully. The commit text isn't convincing me
>that it is actually safe to make this change. It mostly talks about the
>performance improvements and how we need to fix __tcs_trigger(), which
>is good, but I was hoping to be convinced that not grabbing the lock
>here is safe.
>
>How do we ensure that drv->tcs_in_use is cleared before we call
>tcs_write() and try to look for a free bit? Isn't it possible that we'll
>get into a situation where the bitmap is all used up but the hardware
>has just received an interrupt and is going to clear out a bit and then
>an rpmh write fails with -EBUSY?
>
If we have a situation where there are no available free bits, we retry
and that is part of the function. Since we have only 2 TCSes avaialble
to write to the hardware and there could be multiple requests coming in,
it is a very common situation. We try and acquire the drv->lock and if
there are free TCS available and if available mark them busy and send
our requests. If there are none available, we keep retrying.

>>  drivers/soc/qcom/rpmh-rsc.c | 4 +---
>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
>> index 5ede8d6de3ad..694ba881624e 100644
>> --- a/drivers/soc/qcom/rpmh-rsc.c
>> +++ b/drivers/soc/qcom/rpmh-rsc.c
>> @@ -242,9 +242,7 @@ static irqreturn_t tcs_tx_done(int irq, void *p)
>>                 write_tcs_reg(drv, RSC_DRV_CMD_ENABLE, i, 0);
>>                 write_tcs_reg(drv, RSC_DRV_CMD_WAIT_FOR_CMPL, i, 0);
>>                 write_tcs_reg(drv, RSC_DRV_IRQ_CLEAR, 0, BIT(i));
>> -               spin_lock(&drv->lock);
>>                 clear_bit(i, drv->tcs_in_use);
>> -               spin_unlock(&drv->lock);
>>                 if (req)
>>                         rpmh_tx_done(req, err);
>>         }
>> @@ -304,7 +302,7 @@ static void __tcs_trigger(struct rsc_drv *drv, int tcs_id)
>>         enable = TCS_AMC_MODE_ENABLE;
>>         write_tcs_reg_sync(drv, RSC_DRV_CONTROL, tcs_id, enable);
>>         enable |= TCS_AMC_MODE_TRIGGER;
>> -       write_tcs_reg_sync(drv, RSC_DRV_CONTROL, tcs_id, enable);
>> +       write_tcs_reg(drv, RSC_DRV_CONTROL, tcs_id, enable);
>>  }
>>
>>  static int check_for_req_inflight(struct rsc_drv *drv, struct tcs_group *tcs,