Date:   Wed, 12 Jun 2019 10:31:53 +0100
From:   Will Deacon <will.deacon@....com>
To:     Jayachandran Chandrasekharan Nair <jnair@...vell.com>
Cc:     Ard Biesheuvel <ard.biesheuvel@...aro.org>,
        "catalin.marinas@....com" <catalin.marinas@....com>,
        Jan Glauber <jglauber@...vell.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [RFC] Disable lockref on arm64

Hi JC,

On Wed, Jun 12, 2019 at 04:10:20AM +0000, Jayachandran Chandrasekharan Nair wrote:
> On Wed, May 22, 2019 at 05:04:17PM +0100, Will Deacon wrote:
> > On Sat, May 18, 2019 at 12:00:34PM +0200, Ard Biesheuvel wrote:
> > > On Sat, 18 May 2019 at 06:25, Jayachandran Chandrasekharan Nair
> > > <jnair@...vell.com> wrote:
> > > > Looking thru the perf output of this case (open/close of a file from
> > > > multiple CPUs), I see that refcount is a significant factor in most
> > > > kernel configurations - and that too uses cmpxchg (without yield).
> > > > x86 has an optimized inline version of refcount that helps
> > > > significantly. Do you think this is worth looking at for arm64?
> > > >
> > > 
> > > I looked into this a while ago [0], but at the time, we decided to
> > > stick with the generic implementation until we encountered a use case
> > > that benefits from it. Worth a try, I suppose ...
> > > 
> > > [0] https://lore.kernel.org/linux-arm-kernel/20170903101622.12093-1-ard.biesheuvel@linaro.org/
> > 
> > If JC can show that we benefit from this, it would be interesting to see if
> > we can implement the refcount-full saturating arithmetic using the
> > LDMIN/LDMAX instructions instead of the current cmpxchg() loops.
> 
> Now that the lockref change is mainline, I think we need to take another
> look at this patch.

Before we get too involved with this, I really don't want to start a trend of
"let's try to rewrite all code using cmpxchg() in Linux because of TX2". At
some point, the hardware needs to play ball. However...

Ard's refcount patch was about moving the overflow check out-of-line. A
side-effect of this is that many of the operations no longer need a
cmpxchg() at all (atomic_add_unless() disappears), and it's /this/ which
helps you. So there may well be a middle ground where we avoid the
complexity of the out-of-line {over,under}flow handling but do the
saturation post-atomic inline.
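
To make the contrast concrete, here is a minimal userspace sketch in C11
atomics, not kernel code. The function names and SKETCH_SATURATED are
hypothetical and only illustrate the two styles: a cmpxchg() loop that
checks the value before writing it, versus a single fetch-and-add followed
by an inline saturation fix-up.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical saturation value, placed well away from 0 and from overflow. */
#define SKETCH_SATURATED (INT_MIN / 2)

/*
 * Checked style: every update goes through a compare-and-swap loop, so
 * the value is validated before it is ever written back.
 */
static bool sketch_inc_not_zero_cmpxchg(atomic_uint *r)
{
	unsigned int old = atomic_load_explicit(r, memory_order_relaxed);

	do {
		if (old == 0)
			return false;	/* object already released */
		if (old == UINT_MAX)
			return true;	/* saturated: leave it pinned */
	} while (!atomic_compare_exchange_weak_explicit(r, &old, old + 1,
			memory_order_relaxed, memory_order_relaxed));

	return true;
}

/*
 * Post-atomic style: one unconditional fetch-and-add, then repair the
 * counter if the value we observed shows that it overflowed or was
 * already saturated. C11 atomic arithmetic on signed types wraps.
 */
static void sketch_inc_post_saturate(atomic_int *r)
{
	int old = atomic_fetch_add_explicit(r, 1, memory_order_relaxed);

	if (old < 0 || old == INT_MAX)
		atomic_store_explicit(r, SKETCH_SATURATED, memory_order_relaxed);
}
```

In the second function there is a window between the add and the store in
which other threads can observe a wrapped value, which is why this sketch
puts the saturation point in the middle of the negative range rather than
at the boundary.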

I was hoping we could use LDMIN/LDMAX to maintain the semantics of
REFCOUNT_FULL, but now that I think about it I can't see how we could keep
the arithmetic atomic in that case. Hmm.

Whatever we do, I prefer to keep REFCOUNT_FULL the default option for arm64,
so if we can't keep the semantics when we remove the cmpxchg, you'll need to
opt into this at config time.

Will
