lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190531210816.GA24027@sinkpad>
Date:   Fri, 31 May 2019 17:08:16 -0400
From:   Julien Desfossez <jdesfossez@...italocean.com>
To:     Aaron Lu <aaron.lu@...ux.alibaba.com>
Cc:     Aubrey Li <aubrey.intel@...il.com>,
        Vineeth Remanan Pillai <vpillai@...italocean.com>,
        Nishanth Aravamudan <naravamudan@...italocean.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Paul Turner <pjt@...gle.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        Subhra Mazumdar <subhra.mazumdar@...cle.com>,
        Frédéric Weisbecker <fweisbec@...il.com>,
        Kees Cook <keescook@...omium.org>,
        Greg Kerr <kerrnel@...gle.com>, Phil Auld <pauld@...hat.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [RFC PATCH v3 00/16] Core scheduling v3

> My first reaction is: when shell wakes up from sleep, it will
> fork date. If the script is untagged and those workloads are
> tagged and all available cores are already running workload
> threads, the forked date can lose to the running workload
> threads due to __prio_less() can't properly do vruntime comparison
> for tasks on different CPUs. So those idle siblings can't run
> date and are idled instead. See my previous post on this:
> 
> https://lore.kernel.org/lkml/20190429033620.GA128241@aaronlu/
> (Now that I re-read my post, I see that I didn't make it clear
> that se_bash and se_hog are assigned different tags(e.g. hog is
> tagged and bash is untagged).
> 
> Siblings being forced idle is expected due to the nature of core
> scheduling, but when two tasks belonging to two siblings are
> fighting for schedule, we should let the higher priority one win.
> 
> It used to work on v2 is probably due to we mistakenly
> allow different tagged tasks to schedule on the same core at
> the same time, but that is fixed in v3.

I confirm this is indeed what is happening, we reproduced it with a
simple script that only uses one core (cpu 2 and 38 are sibling on this
machine):

setup:
cgcreate -g cpu,cpuset:test
cgcreate -g cpu,cpuset:test/set1
cgcreate -g cpu,cpuset:test/set2
echo 2,38 > /sys/fs/cgroup/cpuset/test/cpuset.cpus
echo 0 > /sys/fs/cgroup/cpuset/test/cpuset.mems
echo 2,38 > /sys/fs/cgroup/cpuset/test/set1/cpuset.cpus
echo 2,38 > /sys/fs/cgroup/cpuset/test/set2/cpuset.cpus
echo 0 > /sys/fs/cgroup/cpuset/test/set1/cpuset.mems
echo 0 > /sys/fs/cgroup/cpuset/test/set2/cpuset.mems
echo 1 > /sys/fs/cgroup/cpu,cpuacct/test/set1/cpu.tag

In one terminal:
sudo cgexec -g cpu,cpuset:test/set1 sysbench --threads=1 --time=30
--test=cpu run

In another one:
sudo cgexec -g cpu,cpuset:test/set2 date

It's very clear that 'date' hangs until sysbench is done.

We started experimenting with marking a task on the forced idle sibling
if normalized vruntimes are equal. That way, at the next compare, if the
normalized vruntimes are still equal, it prefers the task on the forced
idle sibling. It still needs more work, but in our early tests it helps.

Thanks,

Julien

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ