Message-ID: <20190109180225.GA28624@redhat.com>
Date:   Wed, 9 Jan 2019 13:02:25 -0500
From:   Andrea Arcangeli <aarcange@...hat.com>
To:     Mike Galbraith <efault@....de>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/1] RFC: sched/fair: skip select_idle_sibling() in
 presence of sync wakeups

Hello Mike,

On Wed, Jan 09, 2019 at 05:19:48AM +0100, Mike Galbraith wrote:
> On Tue, 2019-01-08 at 22:49 -0500, Andrea Arcangeli wrote:
> > Hello,
> > 
> > we noticed some unexpected performance regressions in the scheduler by
> > switching the guest CPU topology from "-smp 2,sockets=2,cores=1" to
> > "-smp 2,sockets=1,cores=2".
*snip*
> > To test I used this trivial program.
> 
> Which highlights the problem.  That proggy really is synchronous, but

Note that I wrote the program only after the guest scheduler
regression was reported, purely in order to test the patch and to
reproduce the customer issue more easily (so I could see the effect by
just running top). The regression was reported against a real-life
customer workload AFAIK, and it was caused by the idle balancing
dropping the sync information.

If it were just a lat_ctx type of workload like the program I
attached I wouldn't care either, but this was a localhost udp and tcp
test (both bandwidth and latency) that showed improvement from not
dropping the sync information through idle core balancing during
wakeups.
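For reference, here is a minimal sketch of the kind of synchronous
ping-pong the test program exercises (purely illustrative, not the
actual program attached to the original mail): a parent and child
bounce one byte back and forth over a pipe pair, so at any moment only
one task is runnable and each write is a textbook sync wakeup of the
peer.

```c
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Bounce one byte between parent and child over two pipes.
 * Each side blocks until the other wakes it, so every wakeup
 * is synchronous. Returns the number of completed round trips,
 * or -1 on setup failure. */
int pingpong(int iters)
{
	int ab[2], ba[2];	/* parent->child and child->parent pipes */
	char c = 'x';
	pid_t pid;
	int i, done = 0;

	if (pipe(ab) || pipe(ba))
		return -1;

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* child: echo every byte straight back */
		for (i = 0; i < iters; i++) {
			if (read(ab[0], &c, 1) != 1 ||
			    write(ba[1], &c, 1) != 1)
				_exit(1);
		}
		_exit(0);
	}

	/* parent: send a byte, wait for the echo */
	for (i = 0; i < iters; i++) {
		if (write(ab[1], &c, 1) != 1 ||
		    read(ba[0], &c, 1) != 1)
			break;
		done++;
	}
	waitpid(pid, NULL, 0);
	return done;
}
```

Running something like this under top makes the placement effect easy
to see: with the sync hint honored both tasks stay on one CPU, with it
dropped they bounce between idle cores.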

There is no tunable that lets people test the sync information with
real workloads; the only ways are to rebuild the kernel with SCHED_MC=n
(which nobody should do because it has other drawbacks) or to alter
the vCPU topology. So for now we're working to restore the standard
sockets-only topology to shut off the idle balancing without having to
patch the guest scheduler, but this looks like a more general problem
with room for improvement.

Ideally we should detect when the sync information is worth keeping
instead of always dropping it. Alternatively a sched_feat could be
added to achieve it manually.
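As a rough sketch of the sched_feat route (the feature name below is
made up; only the SCHED_FEAT()/sched_feat() machinery in
kernel/sched/features.h and kernel/sched/fair.c is real), something
along these lines would make the behavior toggleable at runtime
through /sys/kernel/debug/sched_features:

```diff
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@
+/*
+ * Hypothetical knob: when set, honour the sync hint on wakeup
+ * instead of letting idle core balancing drop it.
+ */
+SCHED_FEAT(SYNC_WAKEUPS, false)
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@
 	/* in the wake_affine/select path, schematically: */
+	if (sched_feat(SYNC_WAKEUPS) && sync)
+		return this_cpu;	/* skip select_idle_sibling() */
```

That would at least let people measure the effect on real workloads
without rebuilding the kernel or changing the vCPU topology.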

Thanks,
Andrea
