Date:   Tue, 10 May 2022 12:03:24 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Waiman Long <longman@...hat.com>
Cc:     "ying.huang@...el.com" <ying.huang@...el.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Will Deacon <will@...nel.org>, Aaron Lu <aaron.lu@...el.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        kernel test robot <oliver.sang@...el.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Michal Hocko <mhocko@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        kernel test robot <lkp@...el.com>,
        Feng Tang <feng.tang@...el.com>,
        Zhengjun Xing <zhengjun.xing@...ux.intel.com>,
        fengwei.yin@...el.com
Subject: Re: [mm/page_alloc] f26b3fa046: netperf.Throughput_Mbps -18.0% regression

On Tue, May 10, 2022 at 11:47 AM Waiman Long <longman@...hat.com> wrote:
> Qspinlock still has one head waiter spinning on the lock. This is much
> better than the original ticket spinlock where there will be n waiters
> spinning on the lock.

Oh, absolutely. I'm not saying we should look at going back. I'm more
asking whether maybe we could go even further..
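
(To spell out the contrast for anyone following along - this is a toy
illustration, not the kernel code: with a ticket lock every waiter polls
the one shared lock word, while with the MCS-style queue everybody past
the head spins on its own private node.)

/* Toy ticket lock, illustration only: all n waiters poll the same
 * 'owner' word, so every release bounces that cacheline across
 * every waiting CPU. */
struct toy_ticket_lock {
        atomic_t next;
        atomic_t owner;
};

static void toy_ticket_lock_acquire(struct toy_ticket_lock *l)
{
        int me = atomic_fetch_inc(&l->next);

        while (atomic_read(&l->owner) != me)
                cpu_relax();
}

static void toy_ticket_lock_release(struct toy_ticket_lock *l)
{
        atomic_inc(&l->owner);
}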

> That is the cost of a cheap unlock. There is no way to eliminate all
> lock spinning unless we use MCS lock directly which will require a
> change in locking API as well as more expensive unlock.

So there's no question that unlock would be more expensive for the
contention case, since it would always have to not only clear the lock
itself but also update the node it points to.
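
Something like the textbook MCS release, just to make the "clear the
lock and poke the node" point concrete (rough sketch from memory, not
anything we have in-tree):

struct mcs_node {
        struct mcs_node *next;
        int locked;
};

static void mcs_unlock(struct mcs_node **lock, struct mcs_node *me)
{
        struct mcs_node *next = READ_ONCE(me->next);

        if (!next) {
                /* No visible successor: try to clear the tail pointer. */
                if (cmpxchg_release(lock, me, NULL) == me)
                        return;
                /* Somebody is in the middle of enqueueing - wait for
                 * them to link themselves in behind us. */
                while (!(next = READ_ONCE(me->next)))
                        cpu_relax();
        }
        /* Hand the lock over by writing to the successor's node. */
        smp_store_release(&next->locked, 1);
}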

But does it actually require a change in the locking API?

The qspinlock slowpath already always allocates that mcs node (for
some definition of "always" - I am obviously ignoring all the trylock
cases both before and in the slowpath).
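
(From memory, the slowpath today starts out roughly like

        node = this_cpu_ptr(&qnodes[0].mcs);
        idx = node->count++;
        tail = encode_tail(smp_processor_id(), idx);
        ...
        node = grab_mcs_node(node, idx);

so the per-CPU node is already there - it just never gets handed over
to the unlock side. Don't quote me on the exact details.)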

But yes, clearly the simple store-release of the current
queued_spin_unlock() wouldn't work as-is, and maybe the cost of
replacing it with something else is much more expensive than any
possible win.
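
For reference, the generic version today is (again from memory) just
the one byte store:

static __always_inline void queued_spin_unlock(struct qspinlock *lock)
{
        /*
         * unlock() needs release semantics:
         */
        smp_store_release(&lock->locked, 0);
}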

I think the PV case already basically does that - replacing the
"store release" with a much more complex sequence. No?

         Linus
