linux-kernel - Re: [PATCH] arm64/io: Don't use WZR in writel

Message-ID: <CAKv+Gu9ddTYEVA=HNgc6Lbk5S03C12vXFewhqkWE1=gfOKjmXA@mail.gmail.com>
Date:   Mon, 18 Mar 2019 18:11:10 +0100
From:   Ard Biesheuvel <ard.biesheuvel@...aro.org>
To:     Russell King - ARM Linux admin <linux@...linux.org.uk>
Cc:     Robin Murphy <robin.murphy@....com>, Jens Axboe <axboe@...nel.dk>,
        Marc Gonzalez <marc.w.gonzalez@...e.fr>,
        Marc Zyngier <marc.zyngier@....com>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will.deacon@....com>,
        LKML <linux-kernel@...r.kernel.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Jeffrey Hugo <jhugo@...eaurora.org>,
        MSM <linux-arm-msm@...r.kernel.org>,
        AngeloGioacchino Del Regno <kholk11@...il.com>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH] arm64/io: Don't use WZR in writel

On Mon, 18 Mar 2019 at 18:01, Russell King - ARM Linux admin
<linux@...linux.org.uk> wrote:
>
> On Mon, Mar 18, 2019 at 04:04:03PM +0000, Robin Murphy wrote:
> > On 12/03/2019 12:36, Marc Gonzalez wrote:
> > > On 24/02/2019 04:53, Bjorn Andersson wrote:
> > >
> > > > On Sat 23 Feb 10:37 PST 2019, Marc Zyngier wrote:
> > > >
> > > > > On Sat, 23 Feb 2019 18:12:54 +0000, Bjorn Andersson wrote:
> > > > > >
> > > > > > On Mon 11 Feb 06:59 PST 2019, Marc Zyngier wrote:
> > > > > >
> > > > > > > On 11/02/2019 14:29, AngeloGioacchino Del Regno wrote:
> > > > > > >
> > > > > > > > Also, just one more thing: yes this thing is going ARM64-wide and
> > > > > > > > - from my findings - it's targeting certain Qualcomm SoCs, but...
> > > > > > > > I'm not sure that only QC is affected by that, others may as well
> > > > > > > > have the same stupid bug.
> > > > > > >
> > > > > > > At the moment, only QC SoCs seem to be affected, probably because
> > > > > > > everyone else has debugged their hypervisor (or most likely doesn't
> > > > > > > bother with shipping one).
> > > > > > >
> > > > > > > In all honesty, we need some information from QC here: which SoCs are
> > > > > > > affected, what is the exact nature of the bug, can it be triggered from
> > > > > > > EL0. Randomly papering over symptoms is not something I really like
> > > > > > > doing, and is likely to generate problems on unaffected systems.
> > > > > >
> > > > > > The bug at hand is that the XZR is not deemed a valid source in the
> > > > > > virtualization of the SMMU registers. It was identified and fixed for
> > > > > > all platforms that are shipping kernels based on v4.9 or later.
> > > > >
> > > > > When you say "fixed": Do you mean fixed in the firmware? Or by adding
> > > > > a workaround in the shipped kernel?
> > > >
> > > > I mean that it's fixed in the firmware.
> > > >
> > > > > If the former, is this part of an official QC statement, with an
> > > > > associated erratum number?
> > > >
> > > > I don't know, will get back to you on this one.
> > > >
> > > > > Is this really limited to the SMMU accesses?
> > > >
> > > > Yes.
> > > >
> > > > > > As such Angelo's list of affected platforms covers the high-profile
> > > > > > > ones. In particular MSM8996 and MSM8998 are getting pretty good support
> > > > > > upstream, if we can figure out a way around this issue.
> > > > >
> > > > > We'd need an exhaustive list of the affected SoCs, and work out if we
> > > > > can limit the hack to the SMMU driver (cc'ing Robin, who's the one
> > > > > who'd know about it).
> > > >
> > > > I will try to compose a list.
> > >
> > > FWIW, I have just been bitten by this issue. I needed to enable an SMMU to
> > > filter PCIe EP accesses to system RAM (or something). I'm using an APQ8098
> > > MEDIABOX dev board. My system hangs in arm_smmu_device_reset() doing:
> > >
> > >     /* Invalidate the TLB, just in case */
> > >     writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH);
> > >     writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH);
> > >
> > >
> > > With the 'Z' constraint, gcc generates:
> > >
> > >     str wzr, [x0]
> > >
> > > without the 'Z' constraint, gcc generates:
> > >
> > >     mov     w1, 0
> > >     str w1, [x0]
> > >
> > >
> > > I can work around the problem using the following patch:
> > >
> > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > > index 045d93884164..93117519aed8 100644
> > > --- a/drivers/iommu/arm-smmu.c
> > > +++ b/drivers/iommu/arm-smmu.c
> > > @@ -59,6 +59,11 @@
> > >   #include "arm-smmu-regs.h"
> > > +static inline void qcom_writel(u32 val, volatile void __iomem *addr)
> > > +{
> > > +   asm volatile("str %w0, [%1]" : : "r" (val), "r" (addr));
> > > +}
> > > +
> > >   #define ARM_MMU500_ACTLR_CPRE             (1 << 1)
> > >   #define ARM_MMU500_ACR_CACHE_LOCK (1 << 26)
> > > @@ -422,7 +427,7 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu,
> > >   {
> > >     unsigned int spin_cnt, delay;
> > > -   writel_relaxed(0, sync);
> > > +   qcom_writel(0, sync);
> > >     for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
> > >             for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
> > >                     if (!(readl_relaxed(status) & sTLBGSTATUS_GSACTIVE))
> > > @@ -1760,8 +1765,8 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
> > >     }
> > >     /* Invalidate the TLB, just in case */
> > > -   writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH);
> > > -   writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH);
> > > +   qcom_writel(0, gr0_base + ARM_SMMU_GR0_TLBIALLH);
> > > +   qcom_writel(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH);
> > >     reg = readl_relaxed(ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
> > >
> > >
> > >
> > > Can a quirk be used to work around the issue?
> > > Or can we just "pessimize" the 3 writes for everybody?
> > > (Might be cheaper than a test anyway)
> >
> > If it really is just the SMMU driver which is affected, we can work around
> > it for free (not counting the 'cost' of slightly-weird-looking code, of
> > course). If the diff below works as expected, I'll write it up properly.
> >
> > Robin.
> > ----->8-----
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index 045d93884164..7ff29e33298f 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -422,7 +422,7 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device
> > *smmu,
> >  {
> >       unsigned int spin_cnt, delay;
> >
> > -     writel_relaxed(0, sync);
> > +     writel_relaxed((unsigned long)sync, sync);
> >       for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
> >               for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
> >                       if (!(readl_relaxed(status) & sTLBGSTATUS_GSACTIVE))
> > @@ -681,7 +681,12 @@ static void arm_smmu_write_context_bank(struct
> > arm_smmu_device *smmu, int idx)
> >
> >       /* Unassigned context banks only need disabling */
> >       if (!cfg) {
> > -             writel_relaxed(0, cb_base + ARM_SMMU_CB_SCTLR);
> > +             /*
> > +              * For Qualcomm reasons, we want to guarantee that we write a
> > +              * zero from a register which is not WZR. Fortunately, the cfg
> > +              * logic here plays right into our hands...
> > +              */
> > +             writel_relaxed((unsigned long)cfg, cb_base + ARM_SMMU_CB_SCTLR);
> >               return;
> >       }
> >
> > @@ -1760,8 +1765,8 @@ static void arm_smmu_device_reset(struct
> > arm_smmu_device *smmu)
> >       }
> >
> >       /* Invalidate the TLB, just in case */
> > -     writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH);
> > -     writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH);
> > +     writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_TLBIALLH);
> > +     writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH);
> >
> >       reg = readl_relaxed(ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
> >
>
> Given what we've seen from Clang for futex stuff in 32-bit ARM, are
> you really sure that the above will not result in Clang still spotting
> that the value is zero and using a wzr for all these cases?
>

Yeah, it seems to me that even GCC would still be likely to treat cfg
as a constant zero when fulfilling the asm constraints if it occurs
inside an 'if (!cfg) {}' block.
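
To illustrate the concern, here is a minimal user-space sketch (not
kernel code, and untested on the affected hardware). The names
hide_from_optimizer(), mmio_write32() and disable_bank() are made up
for this example. The idea is that inside the '!cfg' branch the
compiler can prove the value is zero, so casting cfg does not by
itself keep the zero register out of the store. Passing the value
through an empty asm with a "+r" constraint (the same trick the
kernel's OPTIMIZER_HIDE_VAR() uses) should make it opaque, so it has
to be materialised in a general-purpose register:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: the empty asm claims to read and modify 'val' in a
 * general-purpose register, so the compiler can no longer assume it knows
 * the value afterwards. No instructions are emitted by the asm itself.
 */
static inline uint32_t hide_from_optimizer(uint32_t val)
{
	__asm__ __volatile__("" : "+r" (val));
	return val;
}

/* Hypothetical stand-in for writel_relaxed(): a plain volatile 32-bit store. */
static inline void mmio_write32(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
}

/*
 * Mirrors the shape of the '!cfg' branch in the quoted diff. Writing
 * (uint32_t)(uintptr_t)cfg here would still be a known zero to the
 * compiler; the hidden zero is not.
 */
static void disable_bank(const void *cfg, volatile uint32_t *sctlr)
{
	if (!cfg) {
		mmio_write32(hide_from_optimizer(0), sctlr);
		return;
	}
	mmio_write32(1, sctlr);	/* placeholder for the real configuration */
}
```

Whether a given compiler actually picks wzr for the cast variant
depends on the compiler and optimisation level, so the generated
assembly would need checking either way.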