linux-kernel - Re: [RFC PATCH v3 00/16] Core scheduling v3

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20191011114851.GA8750@aaronlu>
Date:   Fri, 11 Oct 2019 20:01:56 +0800
From:   Aaron Lu <aaron.lu@...ux.alibaba.com>
To:     Vineeth Remanan Pillai <vpillai@...italocean.com>
Cc:     Tim Chen <tim.c.chen@...ux.intel.com>,
        Julien Desfossez <jdesfossez@...italocean.com>,
        Dario Faggioli <dfaggioli@...e.com>,
        "Li, Aubrey" <aubrey.li@...ux.intel.com>,
        Aubrey Li <aubrey.intel@...il.com>,
        Nishanth Aravamudan <naravamudan@...italocean.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Paul Turner <pjt@...gle.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        Frédéric Weisbecker <fweisbec@...il.com>,
        Kees Cook <keescook@...omium.org>,
        Greg Kerr <kerrnel@...gle.com>, Phil Auld <pauld@...hat.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [RFC PATCH v3 00/16] Core scheduling v3

On Fri, Oct 11, 2019 at 07:32:48AM -0400, Vineeth Remanan Pillai wrote:
> > > The reason we need to do this is because, new tasks that gets created will
> > > have a vruntime based on the new min_vruntime and old tasks will have it
> > > based on the old min_vruntime
> >
> > I think this is expected behaviour.
> >
> I don't think this is the expected behavior. If we hadn't changed the root
> cfs->min_vruntime for the core rq, then it would have been the expected
> behaviour. But now, we are updating the core rq's root cfs, min_vruntime
> without changing the the vruntime down to the tree. To explain, consider
> this example based on your patch. Let cpu 1 and 2 be siblings. And let rq(cpu1)
> be the core rq. Let rq1->cfs->min_vruntime=1000 and rq2->cfs->min_vruntime=2000.
> So in update_core_cfs_rq_min_vruntime, you update rq1->cfs->min_vruntime
> to 2000 because that is the max. So new tasks enqueued on rq1 starts with
> vruntime of 2000 while the tasks in that runqueue are still based on the old
> min_vruntime(1000). So the new tasks gets enqueued some where to the right
> of the tree and has to wait until already existing tasks catch up the
> vruntime to
> 2000. This is what I meant by starvation. This happens always when we update
> the core rq's cfs->min_vruntime. Hope this clarifies.

Thanks for the clarification.

Yes, this is the initialization issue I mentioned before when core
scheduling is initially enabled. rq1's vruntime is bumped the first time
update_core_cfs_rq_min_vruntime() is called and if there are already
some tasks queued, new tasks queued on rq1 will be starved to some extent.

Agree that this needs fix. But we shouldn't need do this afterwards.

So do I understand correctly that patch1 is meant to solve the
initialization issue?