linux-kernel - Re: [PATCH v10 3/3] soc: qcom: rpmh: Invoke rpmh


Message-ID: <CAD=FV=UYpO2rSOoF-OdZd3jKfSZGKnpQJPoiE5fzH+u1uafS6g@mail.gmail.com>
Date:   Thu, 5 Mar 2020 14:20:03 -0800
From:   Doug Anderson <dianders@...omium.org>
To:     Maulik Shah <mkshah@...eaurora.org>
Cc:     Stephen Boyd <swboyd@...omium.org>,
        Matthias Kaehlcke <mka@...omium.org>,
        Evan Green <evgreen@...omium.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-arm-msm <linux-arm-msm@...r.kernel.org>,
        Andy Gross <agross@...nel.org>,
        Rajendra Nayak <rnayak@...eaurora.org>,
        Lina Iyer <ilina@...eaurora.org>, lsrao@...eaurora.org
Subject: Re: [PATCH v10 3/3] soc: qcom: rpmh: Invoke rpmh_flush() for dirty caches

Hi,

On Thu, Mar 5, 2020 at 3:30 AM Maulik Shah <mkshah@...eaurora.org> wrote:
>
> >> +                       spin_unlock_irqrestore(&ctrlr->cache_lock, flags);
> >> +                       return -EINVAL;
> > nit: why not add "int ret = 0" to the top of the function, then here:
> >
> > if (rpmh_flush(ctrl))
> >   ret = -EINVAL;
> >
> > ...then at the end "return ret".  It avoids the 2nd copy of the unlock?
> Done.
> >
> > Also: Why throw away the return value of rpmh_flush and replace it
> > with -EINVAL?  Trying to avoid -EBUSY?  ...oh, should you handle
> > -EBUSY?  AKA:
> >
> > if (!psci_has_osi_support()) {
> >   do {
> >     ret = rpmh_flush(ctrl);
> >   } while (ret == -EBUSY);
> > }
>
> Done, the return value from rpmh_flush() can be -EAGAIN, not -EBUSY.
>
> i will update the comment accordingly and will include below change as well in next series.
>
> https://patchwork.kernel.org/patch/11364067/
>
> this should address for caller to not handle -EAGAIN.

A few issues, I guess.

1. I _think_ it's important that you enable interrupts between
retries.  If you're on the same CPU that the interrupt is routed to
and you're waiting for 'tcs_in_use' to be cleared, you'll be in
trouble otherwise, since the interrupt can't run while you spin with
interrupts disabled.  ...I think we need to audit all of the places
that loop on -EAGAIN and confirm that interrupts are enabled between
retries.  Before your patch series the only looping I saw was in
rpmh_invalidate(), and the lock wasn't held there.  After your series
it's also in rpmh_flush(), which is called under spin_lock_irqsave(),
and that will be a problem.
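
A rough user-space sketch of the pattern in point 1, for illustration
only.  Nothing here is the real driver: model_ctrlr, model_flush() and
the lock helpers are hypothetical stand-ins, and the model assumes the
completion interrupt is what clears 'tcs_in_use'.  The point is only
that the lock (and with it the IRQ-disabled region) is dropped between
retries, so the pending interrupt gets a chance to run.

```c
/* User-space model only: none of these helpers are kernel APIs. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct model_ctrlr {
	bool irqs_disabled;  /* models local IRQ state under spin_lock_irqsave() */
	bool tcs_in_use;     /* assumed to be cleared only by the completion IRQ */
	int flush_attempts;  /* how many times the flush was tried */
};

static void model_lock_irqsave(struct model_ctrlr *c)
{
	assert(!c->irqs_disabled);  /* the model does not nest the lock */
	c->irqs_disabled = true;
}

static void model_unlock_irqrestore(struct model_ctrlr *c)
{
	c->irqs_disabled = false;
	/* With IRQs back on, the pending completion interrupt can run. */
	c->tcs_in_use = false;
}

/* Stand-in for the flush: fails with -EAGAIN while a TCS is in use. */
static int model_flush(struct model_ctrlr *c)
{
	c->flush_attempts++;
	return c->tcs_in_use ? -EAGAIN : 0;
}

/* Retry loop that releases the lock between attempts. */
static int model_flush_with_retry(struct model_ctrlr *c)
{
	int ret;

	do {
		model_lock_irqsave(c);
		ret = model_flush(c);
		model_unlock_irqrestore(c);
	} while (ret == -EAGAIN);

	return ret;
}
```

If the do/while sat entirely inside the locked region instead,
'tcs_in_use' would never change in this model and the loop would not
terminate.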

2. The RPMH code uses both -EBUSY and -EAGAIN, so I looked carefully at
this again.  You're right that -EBUSY seems to be returned exclusively
by things only called by rpmh_rsc_send_data(), and that function
handles the retries.  ...but looking at this made me find a broken
corner case with the "zero active tcs" case (assuming you care about
this case as per your other thread).  Specifically, if you have "zero
active tcs" then get_tcs_for_msg() can call rpmh_rsc_invalidate(),
which can return -EAGAIN.  That -EAGAIN then propagates out of
tcs_write() into rpmh_rsc_send_data(), which only handles -EBUSY,
not -EAGAIN.
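
To make the corner case in point 2 concrete, here is a small
user-space model.  It is one possible shape for a fix, not a claim
about what the driver should do: model_drv, model_tcs_write() and
model_send_data() are hypothetical names, the return values are
scripted, and the max_tries bound exists only so the model always
terminates.  The sketch simply treats -EAGAIN the same way as -EBUSY
in the retry loop.

```c
/* User-space model only: names and the retry bound are assumptions. */
#include <assert.h>
#include <errno.h>

struct model_drv {
	int results[4];  /* scripted return values for model_tcs_write() */
	int n_results;   /* number of valid entries in results[] */
	int calls;       /* how many times model_tcs_write() has run */
};

/* Stand-in for the write path: replays the script, repeating the last entry. */
static int model_tcs_write(struct model_drv *d)
{
	int idx;

	assert(d->n_results > 0 && d->n_results <= 4);
	idx = d->calls < d->n_results ? d->calls : d->n_results - 1;
	d->calls++;
	return d->results[idx];
}

/* Retry on both error codes instead of only -EBUSY. */
static int model_send_data(struct model_drv *d, int max_tries)
{
	int ret;

	do {
		ret = model_tcs_write(d);
	} while ((ret == -EBUSY || ret == -EAGAIN) && --max_tries > 0);

	return ret;
}
```

With a script of -EBUSY, -EAGAIN, 0 the model retries twice and then
succeeds; a loop that checked only -EBUSY would hand the -EAGAIN back
to the caller on the second attempt.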

-Doug