Message-ID: <1a322df842e0dc5646ef1198ea0bbe668d94646e.camel@posteo.de>
Date:   Tue, 24 Dec 2019 11:03:29 +0100
From:   Philipp Stanner <stanner@...teo.de>
To:     Juri Lelli <juri.lelli@...hat.com>
Cc:     linux-kernel@...r.kernel.org, Hagen Pfeifer <hagen@...u.net>,
        mingo@...hat.com, peterz@...radead.org, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
        mgorman@...e.de
Subject: Re: SCHED_DEADLINE with CPU affinity

On Wed, 20.11.2019, 09:50 +0100 Juri Lelli wrote:
> Hi Philipp,

Hey Juri,

Thanks so far; we could indeed make it work with exclusive CPU-sets.
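
For reference, this is roughly what we ended up doing, written up as a
small C helper (just a sketch: the steps mirror section 5 of
sched-deadline.rst, and mount point, set name and CPU number are
placeholders for our real configuration):

/* Rough sketch of the exclusive-cpuset setup, following section 5 of
 * sched-deadline.rst (cgroup-v1 cpuset controller, run as root).
 * Mount point, set name and CPU number are placeholders. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mount.h>
#include <sys/stat.h>

static void write_str(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f || fprintf(f, "%s", val) < 0 || fclose(f) == EOF) {
		perror(path);
		exit(1);
	}
}

int main(void)
{
	char pid[32];

	mkdir("/dev/cpuset", 0755);
	if (mount("cpuset", "/dev/cpuset", "cgroup", 0, "cpuset")) {
		perror("mount");
		return 1;
	}

	/* Exclusive set containing only CPU 0. */
	mkdir("/dev/cpuset/cpu0", 0755);
	write_str("/dev/cpuset/cpu0/cpuset.cpus", "0");
	write_str("/dev/cpuset/cpu0/cpuset.mems", "0");
	write_str("/dev/cpuset/cpuset.cpu_exclusive", "1");
	write_str("/dev/cpuset/cpuset.sched_load_balance", "0");
	write_str("/dev/cpuset/cpu0/cpuset.cpu_exclusive", "1");
	write_str("/dev/cpuset/cpu0/cpuset.mem_exclusive", "1");

	/* Move ourselves into the set; SCHED_DEADLINE tasks started from
	 * here are admitted against CPU 0 only. */
	snprintf(pid, sizeof(pid), "%d", getpid());
	write_str("/dev/cpuset/cpu0/tasks", pid);

	return 0;
}

The containers are then started with podman run --cgroup-parent
pointing into that set, as you suggest further down.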

> On 19/11/19 23:20, Philipp Stanner wrote:
> 
> > from implementing our intended architecture.
> > 
> > Now, the questions we're having are:
> > 
> >    1. Why does the kernel do this, what is the problem with
> >       scheduling with SCHED_DEADLINE on a certain core? In contrast,
> >       how is it handled when you have single core systems etc.? Why
> >       this artificial limitation?
> 
> Please have also a look (you only mentioned the manpage so, in case
> you missed it) at
> 
> https://elixir.bootlin.com/linux/latest/source/Documentation/scheduler/sched-deadline.rst#L667
> 
> and the document in general should hopefully give you the answer
> about why we need admission control and current limitations regarding
> affinities.
> 
> >    2. How can we possibly implement this? We don't want to use
> >       SCHED_FIFO, because out-of-control tasks would freeze the
> >       entire container.
> 
> I experimented myself a bit with this kind of setup in the past and I
> think I made it work by pre-configuring exclusive cpusets (similarly
> to what is detailed in the doc above) and then starting containers
> inside such exclusive sets with the podman run --cgroup-parent option.
> 
> I don't have proper instructions yet for how to do this (plan to put
> them together soon-ish), but please see if you can make it work with
> this hint.

I fear I have not quite understood yet why this "workaround" leads to
(presumably) the same results as set_affinity would. From what I have
read, I understand it as follows: for SCHED_DEADLINE, admission control
tries to guarantee that the requested policy can actually be met. To do
so, it analyzes the current workload situation, taking in particular
the number of available cores into account.
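
Just so we are talking about the same thing, this is roughly how we
request the policy (basically the example from the sched-deadline.rst
appendix; the parameters are made up, and the struct/syscall are
spelled out by hand because our glibc has no sched_setattr() wrapper):

/* Minimal sketch: request SCHED_DEADLINE via sched_setattr(2).
 * Parameters are illustrative only; the struct layout follows the
 * kernel's uapi definition. */
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef SCHED_DEADLINE
#define SCHED_DEADLINE	6
#endif

struct sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;		/* SCHED_OTHER, SCHED_BATCH */
	uint32_t sched_priority;	/* SCHED_FIFO, SCHED_RR */
	uint64_t sched_runtime;		/* SCHED_DEADLINE, nanoseconds */
	uint64_t sched_deadline;
	uint64_t sched_period;
};

int main(void)
{
	struct sched_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size	    = sizeof(attr);
	attr.sched_policy   = SCHED_DEADLINE;
	attr.sched_runtime  = 10 * 1000 * 1000;	  /* 10 ms of budget... */
	attr.sched_deadline = 30 * 1000 * 1000;	  /* ...within 30 ms... */
	attr.sched_period   = 100 * 1000 * 1000;  /* ...every 100 ms    */

	/* This is where admission control accepts or rejects the request,
	 * based on the overall bandwidth and, as I understand it, on the
	 * task's affinity spanning the whole root domain. */
	if (syscall(__NR_sched_setattr, 0, &attr, 0)) {
		perror("sched_setattr");
		return 1;
	}

	printf("SCHED_DEADLINE granted\n");
	return 0;
}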

Now, with a pre-configured exclusive set, the kernel knows which tasks
will run on which cores, so it can judge whether a process can be
deadline-scheduled or not. But with the default setup, you could start
your processes as SCHED_OTHER, switch them to SCHED_DEADLINE, and later
many of them could suddenly call set_affinity, all wanting to run on
the same core and thereby provoking collisions.
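
If that is correct, then the collision I have in mind is essentially a
task doing something like the following once it is already
SCHED_DEADLINE (again just a sketch; the EBUSY/EPERM behaviour is what
I gathered from the man pages, not something I have verified in the
kernel sources):

/* Sketch of the "collision": a task that is already SCHED_DEADLINE
 * tries to pin itself to a single core. As far as I understand, the
 * kernel refuses this (EBUSY/EPERM per the man pages) because
 * admission was done against all CPUs of the root domain. */
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(0, &set);	/* everyone wants core 0... */

	if (sched_setaffinity(0, sizeof(set), &set)) {
		/* Expected for a deadline task outside an exclusive cpuset. */
		fprintf(stderr, "sched_setaffinity: %s\n", strerror(errno));
		return 1;
	}
	return 0;
}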

Is my understanding of the situation correct?

Merry Christmas,
P.
