linux-kernel - Re: [PATCH v4 4/5] soc: qcom: rpmh-rsc: Simplify locking by eliminating the per-TCS lock

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAD=FV=XFE6it066_EHe2wTNnCkRX30Tp0teehOCDRzU0dk=q4w@mail.gmail.com>
Date:   Fri, 24 Apr 2020 09:47:53 -0700
From:   Doug Anderson <dianders@...omium.org>
To:     Stephen Boyd <swboyd@...omium.org>
Cc:     Andy Gross <agross@...nel.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Matthias Kaehlcke <mka@...omium.org>,
        Maulik Shah <mkshah@...eaurora.org>,
        Evan Green <evgreen@...omium.org>,
        linux-arm-msm <linux-arm-msm@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 4/5] soc: qcom: rpmh-rsc: Simplify locking by
 eliminating the per-TCS lock

Hi,

On Thu, Apr 23, 2020 at 7:48 PM Stephen Boyd <swboyd@...omium.org> wrote:
>
> Quoting Douglas Anderson (2020-04-22 14:55:02)
> > The rpmh-rsc code had both a driver-level lock (sometimes referred to
> > in comments as drv->lock) and a lock per-TCS.  The idea was supposed
> > to be that there would be times where you could get by with just
> > locking a TCS lock and therefor other RPMH users wouldn't be blocked.
> >
> > The above didn't work out so well.
> >
> > Looking at tcs_write() the bigger drv->lock was held for most of the
> > function anyway.  Only the __tcs_buffer_write() and
> > __tcs_set_trigger() calls were called without it the drv->lock.  It
>
> without holding the drv->lock
>
> > actually turns out that in tcs_write() we don't need to hold the
> > drv->lock for those function calls anyway even if the per-TCS lock
> > isn't there anymore.
>
> Why?

It's in the commit as comments, but I'll copy it here too for clarity.


> > Thus, from a tcs_write() point of view, the
> > per-TCS lock was useless.
> >
> > Looking at rpmh_rsc_write_ctrl_data(), only the per-TCS lock was held.
> > It turns out, though, that this function already needs to be called
> > with the equivalent of the drv->lock held anyway (we either need to
> > hold drv->lock as we will in a future patch or we need to know no
> > other CPUs could be running as happens today).  Specifically
> > rpmh_rsc_write_ctrl_data() might be writing to a TCS that has been
> > borrowed for writing an active transation but it never checks this.
> >
> > Let's eliminate this extra overhead and avoid possible AB BA locking
> > headaches.
> >
> > Suggested-by: Maulik Shah <mkshah@...eaurora.org>
> > Signed-off-by: Douglas Anderson <dianders@...omium.org>
> > ---
> >
> > diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
> > index e540e49fd61c..71cebe7fd452 100644
> > --- a/drivers/soc/qcom/rpmh-rsc.c
> > +++ b/drivers/soc/qcom/rpmh-rsc.c
> > @@ -581,24 +575,19 @@ static int tcs_write(struct rsc_drv *drv, const struct tcs_request *msg)
> >         if (IS_ERR(tcs))
> >                 return PTR_ERR(tcs);
> >
> > -       spin_lock_irqsave(&tcs->lock, flags);
> > -       spin_lock(&drv->lock);
> > +       spin_lock_irqsave(&drv->lock, flags);
> >         /*
> >          * The h/w does not like if we send a request to the same address,
> >          * when one is already in-flight or being processed.
> >          */
> >         ret = check_for_req_inflight(drv, tcs, msg);
> > -       if (ret) {
> > -               spin_unlock(&drv->lock);
> > -               goto done_write;
> > -       }
> > +       if (ret)
> > +               goto err;
>
> Nitpick: Usually 'goto err' is used for error paths, not unlock paths.
> Use 'goto unlock' for that.

I'm confused why this isn't an error path.  We're about to return a
non-zero value from the function (indicating an error) and we didn't
actually do the write.  Isn't this an error path?

In any case, "unlock" is fine with me so I'll change it, but maybe you
can help clarify so I don't make the same mistake next time.


> > -       tcs_id = find_free_tcs(tcs);
> > -       if (tcs_id < 0) {
> > -               ret = tcs_id;
> > -               spin_unlock(&drv->lock);
> > -               goto done_write;
> > -       }
> > +       ret = find_free_tcs(tcs);
> > +       if (ret < 0)
> > +               goto err;
> > +       tcs_id = ret;
> >
> >         tcs->req[tcs_id - tcs->offset] = msg;
> >         set_bit(tcs_id, drv->tcs_in_use);
> > @@ -612,13 +601,21 @@ static int tcs_write(struct rsc_drv *drv, const struct tcs_request *msg)
> >                 write_tcs_reg_sync(drv, RSC_DRV_CMD_WAIT_FOR_CMPL, tcs_id, 0);
> >                 enable_tcs_irq(drv, tcs_id, true);
> >         }
> > -       spin_unlock(&drv->lock);
> > +       spin_unlock_irqrestore(&drv->lock, flags);
> >
> > +       /*
> > +        * These two can be done after the lock is released because:
> > +        * - We marked "tcs_in_use" under lock.
> > +        * - Once "tcs_in_use" has been marked nobody else could be writing
> > +        *   to these registers until the interrupt goes off.
> > +        * - The interrupt can't go off until we trigger.
>
> trigger via some function?

I was referring to the __tcs_set_trigger() function call two lines
below.  I'll clarify.



> > +        */
> >         __tcs_buffer_write(drv, tcs_id, 0, msg);
> >         __tcs_set_trigger(drv, tcs_id, true);
> >
> > -done_write:
> > -       spin_unlock_irqrestore(&tcs->lock, flags);
> > +       return 0;
> > +err:
> > +       spin_unlock_irqrestore(&drv->lock, flags);
> >         return ret;
> >  }
> >