Message-ID: <20240409223847.GE1057805@cmpxchg.org>
Date: Tue, 9 Apr 2024 18:38:47 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Tvrtko Ursulin <tursulin@...lia.com>
Cc: linux-kernel@...r.kernel.org, Tvrtko Ursulin <tursulin@...ulin.net>,
Suren Baghdasaryan <surenb@...gle.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>, kernel-dev@...lia.com
Subject: Re: [PATCH] sched/psi: Optimise psi_group_change a bit
[ Oops, I still had an old mutt alias for Ingo's address. ]
Ingo, would you mind taking this through the scheduler tree?
On Fri, Mar 29, 2024 at 02:51:53PM -0400, Johannes Weiner wrote:
> On Fri, Mar 29, 2024 at 04:06:48PM +0000, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin <tursulin@...ulin.net>
> >
> > The current code loops over the psi_states only to call a helper which
> > then resolves back to the action needed for each state using a switch
> > statement. That is effectively a double indirection which, given that
> > all the states need to be explicitly listed and handled anyway, we can
> > simply remove: both the for loop and the switch statement.
> >
> > The benefit is both in the code size and CPU time spent in this function.
> > YMMV but on my Steam Deck, while in a game, the patch makes the CPU usage
> > go from ~2.4% down to ~1.2%. Text size at the same time went from 0x323 to
> > 0x2c1.
> >
> > Signed-off-by: Tvrtko Ursulin <tursulin@...ulin.net>
> > Cc: Johannes Weiner <hannes@...xchg.org>
> > Cc: Suren Baghdasaryan <surenb@...gle.com>
> > Cc: Peter Zijlstra <peterz@...radead.org>
> > Cc: linux-kernel@...r.kernel.org
> > Cc: kernel-dev@...lia.com
>
> This is great.
>
> Acked-by: Johannes Weiner <hannes@...xchg.org>
>
> Ingo, would you mind please taking this through the scheduler tree? I
> think Peter is still out.
>
> Remaining quote below.
>
> Thanks
>
> > ---
> > kernel/sched/psi.c | 54 +++++++++++++++++++++++-----------------------
> > 1 file changed, 27 insertions(+), 27 deletions(-)
> >
> > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> > index 7b4aa5809c0f..55720ecf420e 100644
> > --- a/kernel/sched/psi.c
> > +++ b/kernel/sched/psi.c
> > @@ -218,28 +218,32 @@ void __init psi_init(void)
> > group_init(&psi_system);
> > }
> >
> > -static bool test_state(unsigned int *tasks, enum psi_states state, bool oncpu)
> > +static u32 test_states(unsigned int *tasks, u32 state_mask)
> > {
> > - switch (state) {
> > - case PSI_IO_SOME:
> > - return unlikely(tasks[NR_IOWAIT]);
> > - case PSI_IO_FULL:
> > - return unlikely(tasks[NR_IOWAIT] && !tasks[NR_RUNNING]);
> > - case PSI_MEM_SOME:
> > - return unlikely(tasks[NR_MEMSTALL]);
> > - case PSI_MEM_FULL:
> > - return unlikely(tasks[NR_MEMSTALL] &&
> > - tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING]);
> > - case PSI_CPU_SOME:
> > - return unlikely(tasks[NR_RUNNING] > oncpu);
> > - case PSI_CPU_FULL:
> > - return unlikely(tasks[NR_RUNNING] && !oncpu);
> > - case PSI_NONIDLE:
> > - return tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] ||
> > - tasks[NR_RUNNING];
> > - default:
> > - return false;
> > + const bool oncpu = state_mask & PSI_ONCPU;
> > +
> > + if (tasks[NR_IOWAIT]) {
> > + state_mask |= BIT(PSI_IO_SOME);
> > + if (!tasks[NR_RUNNING])
> > + state_mask |= BIT(PSI_IO_FULL);
> > }
> > +
> > + if (tasks[NR_MEMSTALL]) {
> > + state_mask |= BIT(PSI_MEM_SOME);
> > + if (tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING])
> > + state_mask |= BIT(PSI_MEM_FULL);
> > + }
> > +
> > + if (tasks[NR_RUNNING] > oncpu)
> > + state_mask |= BIT(PSI_CPU_SOME);
> > +
> > + if (tasks[NR_RUNNING] && !oncpu)
> > + state_mask |= BIT(PSI_CPU_FULL);
> > +
> > + if (tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] || tasks[NR_RUNNING])
> > + state_mask |= BIT(PSI_NONIDLE);
> > +
> > + return state_mask;
> > }
> >
> > static void get_recent_times(struct psi_group *group, int cpu,
> > @@ -770,7 +774,6 @@ static void psi_group_change(struct psi_group *group, int cpu,
> > {
> > struct psi_group_cpu *groupc;
> > unsigned int t, m;
> > - enum psi_states s;
> > u32 state_mask;
> >
> > groupc = per_cpu_ptr(group->pcpu, cpu);
> > @@ -841,10 +844,7 @@ static void psi_group_change(struct psi_group *group, int cpu,
> > return;
> > }
> >
> > - for (s = 0; s < NR_PSI_STATES; s++) {
> > - if (test_state(groupc->tasks, s, state_mask & PSI_ONCPU))
> > - state_mask |= (1 << s);
> > - }
> > + state_mask = test_states(groupc->tasks, state_mask);
> >
> > /*
> > * Since we care about lost potential, a memstall is FULL
> > @@ -1194,7 +1194,7 @@ void psi_cgroup_restart(struct psi_group *group)
> > /*
> > * After we disable psi_group->enabled, we don't actually
> > * stop percpu tasks accounting in each psi_group_cpu,
> > - * instead only stop test_state() loop, record_times()
> > + * instead only stop test_states() loop, record_times()
> > * and averaging worker, see psi_group_change() for details.
> > *
> > * When disable cgroup PSI, this function has nothing to sync
> > @@ -1202,7 +1202,7 @@ void psi_cgroup_restart(struct psi_group *group)
> > * would see !psi_group->enabled and only do task accounting.
> > *
> > * When re-enable cgroup PSI, this function use psi_group_change()
> > - * to get correct state mask from test_state() loop on tasks[],
> > + * to get correct state mask from test_states() loop on tasks[],
> > * and restart groupc->state_start from now, use .clear = .set = 0
> > * here since no task status really changed.
> > */
> > --
> > 2.44.0
> >