linux-kernel - Re: [Problem] Cache line starvation


Message-ID: <20181003082304.GL26858@hirez.programming.kicks-ass.net>
Date:   Wed, 3 Oct 2018 10:23:04 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Catalin Marinas <catalin.marinas@....com>
Cc:     bigeasy@...utronix.de,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        daniel.wagner@...mens.com, Will Deacon <will.deacon@....com>,
        x86@...nel.org, Linus Torvalds <torvalds@...ux-foundation.org>,
        "H. Peter Anvin" <hpa@...or.com>, boqun.feng@...il.com,
        Paul McKenney <paulmck@...ux.vnet.ibm.com>
Subject: Re: [Problem] Cache line starvation

On Wed, Oct 03, 2018 at 08:51:50AM +0100, Catalin Marinas wrote:
> On Fri, 21 Sep 2018 at 13:22, Peter Zijlstra <peterz@...radead.org> wrote:
> > On Fri, Sep 21, 2018 at 02:02:26PM +0200, Sebastian Andrzej Siewior wrote:
> > > We reproducibly observe cache line starvation on a Core2Duo E6850 (2
> > > cores), an i5-6400 SKL (4 cores) and on an NXP LS2044A ARM Cortex-A72 (4
> > > cores).
> > >
> > > The problem can be triggered with a v4.9-RT kernel by starting
> >
> > > Daniel reported that disabling ticket locks on 4.4 makes the problem go
> > > away, but he hasn't run a long time test yet and as we saw with 4.14 it can
> > > take quite a while.
> >
> > On 4.4 and 4.9 ARM64 still uses ticket locks. So I'm very interested to
> > know if the ticket locks on x86 really fix or just make it harder.
> >
> > I've been looking at qspinlock in the light of this and there is indeed
> > room for improvement. The ticket lock certainly is much simpler.
> 
> FWIW, in the qspinlock TLA+ model [1], if I replace the
> atomic_fetch_or() model with a try_cmpxchg loop, it violates the
> liveness properties with only 2 CPUs as one keeps locking/unlocking,
> hence changing the lock value, while the other repeatedly fails the
> cmpxchg. Your latest qspinlock patches seem to address this (couldn't
> get it to fail but the model is only sequentially consistent). Not
> sure that's what Sebastian is seeing but without your proposed
> qspinlock changes, ticket spinlocks may be a better bet for RT.

Right, and agreed. I did raise that point when you initially proposed
that fetch_or() for liveness.
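
For readers following the fetch_or() versus try_cmpxchg point, here is a
minimal userspace sketch in C11 atomics. It is not the kernel's qspinlock
code: the bit layout (PENDING_BIT, LOCKED_BIT) and both function names are
hypothetical, chosen only to illustrate the difference. The fetch_or variant
sets the pending bit with one unconditional read-modify-write, so it cannot
lose a race against a CPU that keeps changing the lock word (assuming the
architecture provides fetch_or as a single atomic instruction; on LL/SC
machines it may itself be a retry loop). The compare-exchange variant has to
retry whenever the word changed since it was last read, which is the failure
mode the TLA+ model exposes with two CPUs.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock-word layout for illustration only. */
#define LOCKED_BIT  0x001u
#define PENDING_BIT 0x100u

/*
 * Variant A: a single unconditional atomic OR.
 * The pending bit is always set when this returns; the caller inspects
 * the returned old value to decide what to do next.
 */
static uint32_t set_pending_fetch_or(_Atomic uint32_t *lock)
{
    return atomic_fetch_or_explicit(lock, PENDING_BIT, memory_order_acquire);
}

/*
 * Variant B: a compare-exchange loop.
 * Each attempt fails if another CPU modified the lock word in between
 * (for example by locking and unlocking), so without a bound this loop
 * has no forward-progress guarantee. The bound max_tries is an assumption
 * added here so that starvation shows up as a false return value.
 */
static bool set_pending_cmpxchg(_Atomic uint32_t *lock, unsigned max_tries,
                                uint32_t *old_out)
{
    uint32_t old = atomic_load_explicit(lock, memory_order_relaxed);

    for (unsigned i = 0; i < max_tries; i++) {
        if (atomic_compare_exchange_strong_explicit(lock, &old,
                                                    old | PENDING_BIT,
                                                    memory_order_acquire,
                                                    memory_order_relaxed)) {
            *old_out = old;   /* value observed just before the update */
            return true;
        }
        /* On failure, 'old' now holds the current lock word; retry. */
    }
    return false;             /* starved within the given bound */
}
```

In variant B, a second CPU that repeatedly takes and releases the lock
changes the word between the load and the compare-exchange, so the waiter can
fail indefinitely. Variant A has no such window.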