linux-kernel - Re: [PATCH V3] arm64: Don't flush tlb while clearing the accessed bit

Message-ID: <20181207175330.GC11430@edgewater-inn.cambridge.arm.com>
Date:   Fri, 7 Dec 2018 17:53:31 +0000
From:   Will Deacon <will.deacon@....com>
To:     Alexander Van Brunt <avanbrunt@...dia.com>
Cc:     Ashish Mhetre <amhetre@...dia.com>,
        "mark.rutland@....com" <mark.rutland@....com>,
        "linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>,
        Sachin Nikam <Snikam@...dia.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH V3] arm64: Don't flush tlb while clearing the accessed bit

On Thu, Dec 06, 2018 at 08:42:03PM +0000, Alexander Van Brunt wrote:
> > > > If we roll a TLB invalidation routine without the trailing DSB, what sort of
> > > > performance does that get you?
> > > 
> > > It is not as good. In some cases, it is really bad. Skipping the invalidate was
> > > the most consistent and fast implementation.
> 
> > My problem with that is it's not really much different to just skipping the
> > page table update entirely. Skipping the DSB is closer to what is done on
> > x86, where we bound the stale entry time to the next context-switch.
> 
> Which of the three implementations is the "that" and "it" in the first sentence?

that = it = skipping the whole invalidation + the DSB

> > Given that I already queued the version without the DSB, we have the choice
> > to either continue with that or to revert it and go back to the previous
> > behaviour. Which would you prefer?
> 
> To me, skipping the DSB is a win over doing the invalidate and the DSB because
> it is faster on average.
> 
> DSBs have a big impact on the performance of other CPUs in the inner shareable
> domain because of the ordering requirements. For example, we have observed
> Cortex A57s stalling all CPUs in the cluster until Device accesses complete.
> 
> Would you be open to a patch on top of the DSB skipping patch that skips the
> whole invalidate?

I don't think so; we don't have an upper bound on how long we'll have a
stale TLB if we remove the invalidation completely.

Will