Message-ID: <CAKfTPtBixgju0YX=TLbOWO4s9uHNBMSmnV=xcVBJVfU1wqrM4Q@mail.gmail.com>
Date: Fri, 14 Mar 2025 17:06:50 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Hagar Hemdan <hagarhem@...zon.com>
Cc: Dietmar Eggemann <dietmar.eggemann@....com>, Ingo Molnar <mingo@...hat.com>, 
	Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>, 
	Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Valentin Schneider <vschneid@...hat.com>, linux-kernel@...r.kernel.org, wuchi.zero@...il.com, 
	abuehaze@...zon.com
Subject: Re: [PATCH] /sched/core: Fix Unixbench spawn test regression

On Thu, 13 Mar 2025 at 10:21, Hagar Hemdan <hagarhem@...zon.com> wrote:
>
> On Wed, Mar 12, 2025 at 03:41:40PM +0100, Dietmar Eggemann wrote:
> > On 11/03/2025 17:35, Vincent Guittot wrote:
> > > On Mon, 10 Mar 2025 at 16:29, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
> > >>
> > >> On 10/03/2025 14:59, Vincent Guittot wrote:
> > >>> On Thu, 6 Mar 2025 at 17:26, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
> > >>>>
> > >>>> Hagar reported a 30% drop in UnixBench spawn test with commit
> > >>>> eff6c8ce8d4d ("sched/core: Reduce cost of sched_move_task when config
> > > >>>> autogroup") on an m6g.xlarge AWS EC2 instance with 4 vCPUs and 16 GiB RAM
> > >>>> (aarch64) (single level MC sched domain) [1].
> > >>>>
> > >>>> There is an early bail from sched_move_task() if p->sched_task_group is
> > >>>> equal to p's 'cpu cgroup' (sched_get_task_group()). E.g. both are
> > >>>> pointing to taskgroup '/user.slice/user-1000.slice/session-1.scope'
> > >>>> (Ubuntu '22.04.5 LTS').
> > >>>
> > > >>> Isn't this the same use case that commit eff6c8ce8d4d used to
> > > >>> show the benefit of adding the test if (group ==
> > > >>> tsk->sched_task_group)?
> > >>> Adding Wuchi who added the condition
> > >>
> > >> IMHO, UnixBench spawn reports a performance number according to how many
> > >> tasks could be spawned whereas, IIUC, commit eff6c8ce8d4d was reporting
> > > >> the time spent in sched_move_task().
> > >
> > > > But doesn't your patch revert the benefits shown in the figures of
> > > > commit eff6c8ce8d4d? It skipped sched_move_task() in the do_exit()
> > > > autogroup path and you add it back.
> >
> > Yeah, we do need the PELT update in sched_change_group()
> > (task_change_group_fair()) in the do_exit() path to get the 30% score
> > > back in 'UnixBench spawn', even if that means we spend more time in
> > > sched_move_task() because of it.
> >
> > I retested this and it turns out that 'group == tsk->sched_task_group'
> > is only true when sched_move_task() is called from exit.
> >
> > So to get the score back for 'UnixBench spawn' we should rather revert
> > commit eff6c8ce8d4d.
> >
> > The analysis in my patch still holds though.
> >
> > If you guys agree I can send the revert with my analysis in the
> > patch-header.
> Agreed. The follow-up commit fa614b4feb5a ("sched: Simplify sched_move_task()")
> needs to be reverted as well.

Why do you think it should be reverted as well?
