Message-ID: <DM6PR11MB3260026F33333A70AF2F02169627A@DM6PR11MB3260.namprd11.prod.outlook.com>
Date: Tue, 27 Jun 2023 16:12:51 +0000
From: "Deng, Pan" <pan.deng@...el.com>
To: Peter Zijlstra <peterz@...radead.org>,
"Lu, Aaron" <aaron.lu@...el.com>
CC: "Chen, Tim C" <tim.c.chen@...el.com>,
"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Li, Tianyou" <tianyou.li@...el.com>, "Ma, Yu" <yu.ma@...el.com>,
"Zhu, Lipeng" <lipeng.zhu@...el.com>,
"Chen, Yu C" <yu.c.chen@...el.com>,
Tim Chen <tim.c.chen@...ux.intel.com>
Subject: RE: [PATCH v2] sched/task_group: Re-layout structure to reduce false
sharing
> -----Original Message-----
> From: Peter Zijlstra <peterz@...radead.org>
> Sent: Tuesday, June 27, 2023 6:15 PM
> To: Lu, Aaron <aaron.lu@...el.com>
> Cc: Deng, Pan <pan.deng@...el.com>; Chen, Tim C <tim.c.chen@...el.com>;
> vincent.guittot@...aro.org; linux-kernel@...r.kernel.org; Li, Tianyou
> <tianyou.li@...el.com>; Ma, Yu <yu.ma@...el.com>; Zhu, Lipeng
> <lipeng.zhu@...el.com>; Chen, Yu C <yu.c.chen@...el.com>; Tim Chen
> <tim.c.chen@...ux.intel.com>
> Subject: Re: [PATCH v2] sched/task_group: Re-layout structure to reduce false
> sharing
>
> On Mon, Jun 26, 2023 at 01:47:56PM +0800, Aaron Lu wrote:
>
> > > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > > index ec7b3e0a2b20..4fbd4b3a4bdd 100644
> > > --- a/kernel/sched/sched.h
> > > +++ b/kernel/sched/sched.h
> > > @@ -389,6 +389,19 @@ struct task_group {
> > >  #endif
> > >  #endif
> > >
> > > + struct rcu_head rcu;
> > > + struct list_head list;
> > > +
> > > + struct list_head siblings;
> > > + struct list_head children;
> > > +
> > > + /*
> > > + * To reduce false sharing, current layout is optimized to make
> > > + * sure load_avg is in a different cacheline from parent, rt_se
> > > + * and rt_rq.
> > > + */
>
> That comment is misleading I think; you don't particularly care about those
> fields more than any other active fields that would cause false sharing.
>

How about this one:
/*
* load_avg can also cause cacheline bouncing with parent, rt_se
* and rt_rq; the current layout is optimized to make sure they are
* in different cachelines.
*/

> > > + struct task_group *parent;
> > > +
> >
> > I wonder if we can simply do:
> >
> > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > index ec7b3e0a2b20..31b73e8d9568 100644
> > --- a/kernel/sched/sched.h
> > +++ b/kernel/sched/sched.h
> > @@ -385,7 +385,9 @@ struct task_group {
> > * it in its own cacheline separated from the fields above which
> > * will also be accessed at each tick.
> > */
> > - atomic_long_t load_avg ____cacheline_aligned;
> > + struct {
> > + atomic_long_t load_avg;
> > + } ____cacheline_aligned_in_smp;
> > #endif
> > #endif
> >
> > This way it can make sure there is no false sharing with load_avg no
> > matter how the layout of this structure changes in the future.
>
> This. Also, ISTR there was a series to split this atomic across nodes; whatever
> happened to that, and can we still measure an improvement over this with that
> approach?

I just ran the unixbench context-switching test on 1 node with 40C/80T. Without
this change, perf c2c data shows that cacheline bouncing is still there, and perf
record data shows that set_task_cpu takes ~4.5% of overall cycles. With this
change, the false sharing is resolved and set_task_cpu drops to 0.5% of cycles.
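
As a rough userspace illustration of the wrapper-struct approach Aaron suggested
above (not the kernel code itself), the sketch below assumes a 64-byte cacheline.
demo_group, hot, later_field, DEMO_CACHELINE and demo_line_of are hypothetical
names, and C11 alignas stands in for ____cacheline_aligned_in_smp:

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

/* Assumption: 64-byte cachelines; the kernel derives this per architecture. */
#define DEMO_CACHELINE 64

/* Hypothetical stand-in for struct task_group; only the layout matters here. */
struct demo_group {
	void *parent;
	void *rt_se;
	void *rt_rq;
	/*
	 * Aligning the only member of the wrapper raises the wrapper's own
	 * alignment to a full cacheline and pads its size to a multiple of it,
	 * so no neighbouring field can share the line with load_avg.
	 */
	struct {
		alignas(DEMO_CACHELINE) atomic_long load_avg;
	} hot;
	void *later_field;
};

/* Index of the cacheline that a given byte offset falls into. */
static inline size_t demo_line_of(size_t offset)
{
	return offset / DEMO_CACHELINE;
}

/* Compile-time checks: the wrapper starts on and fills whole cachelines. */
_Static_assert(offsetof(struct demo_group, hot) % DEMO_CACHELINE == 0,
	       "hot must start on a cacheline boundary");
_Static_assert(sizeof(((struct demo_group *)0)->hot) % DEMO_CACHELINE == 0,
	       "hot must occupy whole cachelines");
```

Because the padding belongs to the wrapper rather than to its neighbours, fields
added before or after it later on should not end up sharing a line with load_avg.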
Thanks
Pan