linux-kernel - Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CA+55aFzjcKv40hT_aD4jsdERi+a1pLU2SWPTwvyT2PDnoOmSpQ@mail.gmail.com>
Date:   Mon, 16 Jul 2018 12:30:07 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Michael Ellerman <mpe@...erman.id.au>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Paul McKenney <paulmck@...ux.vnet.ibm.com>,
        Alan Stern <stern@...land.harvard.edu>,
        andrea.parri@...rulasolutions.com,
        Will Deacon <will.deacon@....com>,
        Akira Yokosawa <akiyks@...il.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Daniel Lustig <dlustig@...dia.com>,
        David Howells <dhowells@...hat.com>,
        Jade Alglave <j.alglave@....ac.uk>,
        Luc Maranget <luc.maranget@...ia.fr>,
        Nick Piggin <npiggin@...il.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and
 remove it for ordinary release/acquire

On Mon, Jul 16, 2018 at 7:40 AM Michael Ellerman <mpe@...erman.id.au> wrote:
>
> If the numbers can be trusted it is actually slower to put the sync in
> lock, at least on one of the machines:
>
>               Time
> lwsync_sync   84,932,987,977
> sync_lwsync   93,185,930,333

Very funky.

> I guess arguably it's not a very macro benchmark, but we have a
> context_switch benchmark in the tree[1] which we often use to tune
> things, and it degrades badly. It just spins up two threads and has them
> ping-pong using yield.

I hacked that up to run on x86, and it only is about 5% locking
overhead in my profiles. It's about 18% __switch_to, and a lot of
system call entry/exit, but not a lot of locking.

I'm actually surprised it is even that much locking, since it seems to
be single-cpu, so there should be no contention and the lock (which
seems to be

        rq = this_rq();
        rq_lock(rq, &rf);

in do_sched_yield()) should stay local to the cpu.

And for you the locking is apparently even _more_ noticeable.

But yes, a 10% regression on that context switch thing is huge. You
shouldn't do ping-pong stuff, but people kind of do.

              Linus