Date:   Wed, 12 Feb 2020 18:07:05 -0500
From:   Julien Desfossez <jdesfossez@...italocean.com>
To:     Tim Chen <tim.c.chen@...ux.intel.com>
Cc:     Vineeth Remanan Pillai <vpillai@...italocean.com>,
        Nishanth Aravamudan <naravamudan@...italocean.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Paul Turner <pjt@...gle.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        Dario Faggioli <dfaggioli@...e.com>,
        Frédéric Weisbecker <fweisbec@...il.com>,
        Kees Cook <keescook@...omium.org>,
        Greg Kerr <kerrnel@...gle.com>, Phil Auld <pauld@...hat.com>,
        Aaron Lu <aaron.lwe@...il.com>,
        Aubrey Li <aubrey.intel@...il.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [RFC PATCH v4 00/19] Core scheduling v4

On 05-Feb-2020 04:28:18 PM, Tim Chen wrote:
> On 1/14/20 7:40 AM, Vineeth Remanan Pillai wrote:
> > On Mon, Jan 13, 2020 at 8:12 PM Tim Chen <tim.c.chen@...ux.intel.com> wrote:
> > 
> >> I also encountered kernel panic with the v4 code when taking cpu offline or online
> >> when core scheduler is running.  I've refreshed the previous patch, along
> >> with 3 other patches to fix problems related to CPU online/offline.
> >>
> >> As a side effect of the fix, each core can now operate in core-scheduling
> >> mode or non core-scheduling mode, depending on how many online SMT threads it has.
> >>
> >> Vineet, are you guys planning to refresh v4 and update it to v5?  Aubrey posted
> >> a port to the latest kernel earlier.
> >>
> > Thanks for the updated patch Tim.
> > 
> > We have been testing with v4 rebased on 5.4.8 as RC kernels had given us
> > trouble in the past. v5 is due soon and we are planning to release v5 when
> > 5.5 comes out. As of now, v5 has your crash fixes and Aubrey's changes
> > related to load balancing. We are investigating a performance issue with
> > high overcommit io intensive workload and also we are trying to see if
> > we can add synchronization during VMEXITs so that a guest vm cannot
> > run alongside the host kernel. We also need to think about the userland
> > interface for corescheduling in preparation for upstreaming work.
> > 
> 
> Vineet,
> 
> Have you guys been able to make progress on the issues with I/O intensive workload?

I finally have some results with the following branch:
https://github.com/digitalocean/linux-coresched/tree/coresched/v4-v5.5.y

We tested the following classes of workloads in VMs (all vcpus in the
same cgroup/tag):
- linpack (pure CPU work)
- sysbench TPC-C (MySQL benchmark, good mix of CPU/net/disk)
  with/without noise VMs around
- FIO randrw VM with/without noise VMs around

Our "noise VMs" are 1-vcpu VMs running a simple workload that wakes up
every 30 seconds, sends a couple of metrics over a VPN and goes back to
sleep. They use between 0% and 30% of CPU on the host all the time,
nothing sustained, just ups and downs.

# linpack
3x 12-vcpus VMs pinned on a 36-hwthread NUMA node (with smt on):
- core scheduling manages to perform slightly better than the baseline,
  and by up to 20% in some cases!
- with nosmt (so 2:1 overcommit) the performance drops by 24%
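
The 2:1 figure follows from the vcpu and hwthread counts above. A minimal
sketch of that arithmetic, assuming 2-way SMT (the function name and the
threads_per_core parameter are illustrative, not from the test setup):

```python
from fractions import Fraction

def overcommit(vms, vcpus_per_vm, hwthreads, smt_on=True, threads_per_core=2):
    """Return vcpus per schedulable CPU as an exact fraction.

    Assumption: disabling SMT leaves one usable thread per core, so a
    node with `hwthreads` hardware threads exposes
    hwthreads // threads_per_core CPUs.
    """
    cpus = hwthreads if smt_on else hwthreads // threads_per_core
    return Fraction(vms * vcpus_per_vm, cpus)

smt = overcommit(3, 12, 36, smt_on=True)
nosmt = overcommit(3, 12, 36, smt_on=False)
print(f"smt on: {smt.numerator}:{smt.denominator}")
print(f"nosmt:  {nosmt.numerator}:{nosmt.denominator}")
```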

# sysbench TPC-C
1x 12-vcpus MySQL server on each NUMA node, 48 client threads (running
on a different server):
- without noise: no performance difference between the 3 configurations
- with 96 noise VMs on each NUMA node:
  - Performance drops by 54% with core scheduling
  - Performance drops by 75% with nosmt
We write at about 130MB/s on disk with that test.

# FIO randrw 50%, 1 thread, libaio, bs=128k, iodepth=32
1x 12-vcpus FIO VM, which usually requires only up to 100% CPU overall
(data thread and all vcpus summed). Running alone, we read and write at
about 350MB/s:
 - coresched drops 5%
 - nosmt drops 1%
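
For anyone wanting to approximate the FIO job, here is a rough sketch that
builds a fio command line from the parameters in the heading above. The
mapping of "1 thread" to --numjobs=1 and of "50%" to --rwmixread=50 is an
assumption, and the target file path is a placeholder; the exact job
definition used in these tests is not shown in this mail.

```python
import shlex

def fio_randrw_argv(filename, extra=()):
    """Build a fio argv for a 50/50 random read/write job.

    Only bs, iodepth, the libaio engine and the randrw mode come from
    the description above; `filename` and `extra` are hypothetical
    inputs the caller must supply (size, runtime, direct I/O, ...).
    """
    return [
        "fio",
        "--name=randrw",
        "--rw=randrw",
        "--rwmixread=50",      # assumed meaning of "randrw 50%"
        "--numjobs=1",         # assumed meaning of "1 thread"
        "--ioengine=libaio",
        "--bs=128k",
        "--iodepth=32",
        f"--filename={filename}",
        *extra,
    ]

# Placeholder path: replace with a real test file or device.
print(shlex.join(fio_randrw_argv("/path/to/testfile")))
```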

1:1 ratio of vcpus to hardware threads on the NUMA node (filled with
noise VMs):
 - coresched drops 7%
 - nosmt drops 22%

3:1 ratio:
 - coresched drops 16%
 - nosmt drops 22%

5:1 ratio:
 - coresched drops 51%
 - nosmt drops 61%
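
To compare the two mitigations directly, the drops reported above can be
turned into a ratio of remaining throughput. This is only a sketch: it
assumes both drops in a row are measured against the same baseline, and
the names below are illustrative.

```python
# (coresched drop %, nosmt drop %) as reported in this mail.
DROPS = {
    "sysbench TPC-C, 96 noise VMs": (54, 75),
    "FIO alone": (5, 1),
    "FIO 1:1": (7, 22),
    "FIO 3:1": (16, 22),
    "FIO 5:1": (51, 61),
}

def relative_throughput(coresched_drop, nosmt_drop):
    """Remaining throughput with coresched divided by that with nosmt."""
    return (100 - coresched_drop) / (100 - nosmt_drop)

for name, (cs, ns) in DROPS.items():
    ratio = relative_throughput(cs, ns)
    print(f"{name:30s} coresched/nosmt = {ratio:.2f}")
```

A ratio above 1 means core scheduling retained more throughput than nosmt
for that row.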

So the main conclusion is that for all the test cases we have studied,
core scheduling performs better than nosmt! This is different from what
we tested a while back, so it's looking really good!

Now I am looking for confirmation from others. Dario, did you have time
to re-run your test suite against that same branch?

After that, our next step is to trash all that by adding VMEXIT
synchronization points ;-)

Thanks,

Julien
