Date:   Tue, 11 Aug 2020 08:51:17 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     kernel test robot <rong.a.chen@...el.com>
Cc:     Oleg Nesterov <oleg@...hat.com>, Hugh Dickins <hughd@...gle.com>,
        Michal Hocko <mhocko@...e.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: Re: [mm] 2a9127fcf2: hackbench.throughput -69.2% regression

On Tue, Aug 11, 2020 at 1:34 AM kernel test robot <rong.a.chen@...el.com> wrote:
>
> FYI, we noticed a -69.2% regression of hackbench.throughput due to commit:
>
> commit: 2a9127fcf2296674d58024f83981f40b128fffea ("mm: rewrite wait_on_page_bit_common() logic")
>
> in testcase: hackbench
>
> In addition to that, the commit also has significant impact on the following tests:

You can say that again. It's all over the map, with some benchmarks
showing huge improvement and some showing a lot of downside.

Which is not surprising, I guess. Waking things up earlier can cause
more of a thundering herd effect, and it looks like some path ends up
just going right back to sleep again, with voluntary_context_switches
growing by a factor of 25x, and involuntary_context_switches growing
by 110x if I read that right.

And the reason really does seem to be due to having a _lot_ more
runnable active threads: nr_running.avg increases by 2x, and
runnable_avg.min is 4x what it used to be.

I think this is more of a "Hugh load" - it was likely already scaling
the load past the machine limits, and the more aggressive wakeups just
made it go even further past what resources were available.

The odd thing is that in the profile, wake_up_common does show up,
but it has nothing to do with the page lock. It's the
unix_stream_sendmsg() waking up readers.

I wonder if it used to be synchronized more on the page lock, and now
it's past that, and we end up having a lot of readers on the same unix
domain socket, and we get a thundering herd there when the writer
comes along. Or something.

              Linus
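
Editorial illustration (not part of the original message): the thundering
herd effect described above can be sketched in user space. This is only an
analogy, assuming Python threads and condition variables stand in for
readers sleeping on one socket; it is not the kernel's wait-queue code, and
the name count_wakeups is hypothetical. One writer publishes a single item
and wakes either every sleeping reader or just one, and the function counts
how many readers actually woke up for that one item.

```python
import threading


def count_wakeups(n_readers, wake_all):
    """Count reader wakeups caused by publishing a single item."""
    lock = threading.Lock()
    readers_cv = threading.Condition(lock)  # readers sleep here
    writer_cv = threading.Condition(lock)   # writer waits for progress here
    state = {"items": 0, "waiting": 0, "woken": 0, "done": False}

    def reader():
        with lock:
            while not state["done"]:
                if state["items"] > 0:
                    state["items"] -= 1      # this reader got the item
                    return
                state["waiting"] += 1
                writer_cv.notify()           # tell the writer we are asleep
                readers_cv.wait()
                state["waiting"] -= 1
                if not state["done"]:
                    state["woken"] += 1      # woken by the item, not shutdown
                    writer_cv.notify()

    threads = [threading.Thread(target=reader) for _ in range(n_readers)]
    for t in threads:
        t.start()

    with lock:
        # Make sure every reader is asleep before the writer comes along.
        writer_cv.wait_for(lambda: state["waiting"] == n_readers)
        state["items"] = 1
        if wake_all:
            readers_cv.notify_all()          # herd: everyone wakes for one item
        else:
            readers_cv.notify()              # exactly one reader wakes
        expected = n_readers if wake_all else 1
        writer_cv.wait_for(lambda: state["woken"] == expected)
        # Shut down: release the readers that went back to sleep.
        state["done"] = True
        readers_cv.notify_all()

    for t in threads:
        t.join()
    return state["woken"]


if __name__ == "__main__":
    for mode in (True, False):
        woken = count_wakeups(8, wake_all=mode)
        print(f"wake_all={mode}: {woken} wakeups, {woken - 1} wasted")
```

With eight readers, waking everyone should cost eight wakeups for one item,
seven of which go right back to sleep, while waking a single reader costs
one. That is the same shape as the context-switch growth discussed in the
message, though the real numbers depend on the scheduler and the workload.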
