Message-ID: <ad2a59c680d361847f257cdae576dd479e6ab9ac.camel@surriel.com>
Date:   Thu, 20 Aug 2020 16:39:47 -0400
From:   Rik van Riel <riel@...riel.com>
To:     Dietmar Eggemann <dietmar.eggemann@....com>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     Paul Turner <pjt@...gle.com>,
        "vincent.guittot" <vincent.guittot@...aro.org>, kernel-team@...com,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "dietmar.eggeman" <dietmar.eggeman@....com>
Subject: Re: CFS flat runqueue proposal fixes/update

On Thu, 2020-08-20 at 16:56 +0200, Dietmar Eggemann wrote:
> Hi Rik,
> 
> On 31/07/2020 09:42, Rik van Riel wrote:
> 
> [...]
> 
> > Let's revisit the hierarchy from above, and assign priorities
> > to the cgroups, with the fixed point one being 1000. Let's
> > say cgroups A, A1, and B have priority 1000, while cgroup
> > A2 has priority 1.
> > 
> >         /\
> >        /  \
> >       A    B
> >      / \    \ 
> >     A1 A2   t3
> >    /     \
> >   t1     t2
> > 
> > One consequence of this is that when t1, t2, and t3 each
> > get a time slice, the vruntime of tasks t1 and t3 advances
> > at roughly the same speed as the clock time, while the
> > vruntime of task t2 advances 1000x faster.
> > 
> > This is fine if all three tasks continue to be runnable,
> > since t1, t2 and t3 each get their fair share of CPU time.
> > 
> > However, if t1 goes to sleep, t2 is the only thing running
> > inside cgroup A, which has the same priority as cgroup B,
> > and tasks t2 and t3 should be getting the same amount of
> > CPU time.
> > 
> > They eventually will, but not before task t3 has used up
> > enough CPU time to catch up with the enormous vruntime
> > advance that t2 just suffered.
> > 
> > That needs to be fixed, to get near-immediate convergence,
> > and not convergence after some unknown (potentially long)
> > period of time.
> 
> I'm trying to understand this issue in detail ...
> 
> Since t1 and t2 are single tasks in A1 and A2, this taskgroup level
> shouldn't matter for tick preemption after t1 went to sleep?
> 
> check_preempt_tick() is only invoked for 'cfs_rq->nr_running > 1'
> from entity_tick().
> 
> IMHO, tick preemption is handled between A and B and since they have
> the same cpu.weight (cpu.shares) t2 and t3 get the same time slice
> after t1 went to sleep.
> 
> I think that here tick preemption happens in the 'if (delta_exec >
> ideal_runtime)' condition w/ delta_exec = curr->sum_exec_runtime -
> curr->prev_sum_exec_runtime.
> 
> Did I miss anything?

The issue happens with a flat runqueue, when t1 goes
to sleep, but t2 and t3 continue running.

We need to make sure the vruntime for t2 has not been
advanced so far into the future that cgroup A is unable
to get its fair share of CPU while t1 is sleeping.
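
To put rough numbers on that, here is a toy model of a flat
runqueue using the example hierarchy quoted above. It is only a
sketch and not code from the patches: the names PATHS,
vruntime_scale() and run(), the 1 ms slice, and the "pick the
smallest vruntime" loop are all assumptions made for
illustration. Each task's vruntime advances by the CPU time it
used, scaled by 1000/priority for every cgroup between it and
the root.

```python
FIXED_POINT = 1000  # priority at which vruntime tracks wall-clock time

# Hypothetical encoding of the example hierarchy: cgroup priorities
# on the path from the root down to each task.
PATHS = {
    "t1": [1000, 1000],  # A, A1
    "t2": [1000, 1],     # A, A2
    "t3": [1000],        # B
}


def vruntime_scale(path):
    """Units of vruntime charged per unit of CPU time for a task."""
    scale = 1
    for priority in path:
        scale = scale * FIXED_POINT // priority
    return scale


def run(vruntime, runnable, slices, slice_ms=1):
    """Run the runnable task with the smallest vruntime, one slice at a time.

    Ties are broken by task name so the result is deterministic.
    Returns the CPU time (ms) each task received during this call.
    """
    cpu = {name: 0 for name in vruntime}
    for _ in range(slices):
        task = min(runnable, key=lambda name: (vruntime[name], name))
        vruntime[task] += slice_ms * vruntime_scale(PATHS[task])
        cpu[task] += slice_ms
    return cpu


def simulate():
    vruntime = {"t1": 0, "t2": 0, "t3": 0}
    # Phase 1: all three tasks runnable; stop right after t2 has
    # been charged for its first slice.
    run(vruntime, ["t1", "t2", "t3"], slices=2)
    # Phase 2: t1 goes to sleep. A and B have equal priority, so
    # t2 and t3 should now be splitting the CPU evenly.
    cpu = run(vruntime, ["t2", "t3"], slices=1000)
    return vruntime, cpu


if __name__ == "__main__":
    vruntime, cpu = simulate()
    print("vruntime:", vruntime)
    print("CPU ms after t1 sleeps:", cpu)
```

In this model a single 1 ms slice puts t2 1000 ms ahead in
vruntime, so once t1 sleeps t3 gets the whole next 1000 ms and
t2 gets none of it, even though cgroups A and B are supposed to
be getting equal CPU time by then.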

-- 
All Rights Reversed.
