lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 18 Jun 2024 16:24:01 +0100
From: Qais Yousef <qyousef@...alina.io>
To: Peter Zijlstra <peterz@...radead.org>
Cc: John Stultz <jstultz@...gle.com>, LKML <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Frederic Weisbecker <frederic@...nel.org>,
	Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Daniel Bristot de Oliveira <bristot@...hat.com>,
	Valentin Schneider <vschneid@...hat.com>,
	Joel Fernandes <joel@...lfernandes.org>, kernel-team@...roid.com
Subject: Re: [PATCH] RFC: sched: Rework task_sched_runtime to avoid calling
 update_rq_clock

On 06/18/24 09:04, Peter Zijlstra wrote:
> On Sun, Jun 16, 2024 at 11:36:16PM +0100, Qais Yousef wrote:
> 
> > > Which then gets me something like the (completely untested) below..
> > > 
> > > Hmm?
> > > 
> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index 0935f9d4bb7b..36aed99d6a6c 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -724,7 +724,6 @@ static void update_rq_clock_task(struct rq *rq, s64 delta)
> > >  
> > >  	rq->prev_irq_time += irq_delta;
> > >  	delta -= irq_delta;
> > > -	psi_account_irqtime(rq->curr, irq_delta);
> > >  	delayacct_irq(rq->curr, irq_delta);
> > >  #endif
> > >  #ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
> > > @@ -5459,6 +5458,8 @@ void sched_tick(void)
> > >  
> > >  	sched_clock_tick();
> > >  
> > > +	psi_account_irqtime(curr, NULL, &rq->psi_irq_time);
> > > +
> > >  	rq_lock(rq, &rf);
> > >  
> > >  	update_rq_clock(rq);
> > > @@ -6521,6 +6524,7 @@ static void __sched notrace __schedule(unsigned int sched_mode)
> > >  		++*switch_count;
> > >  
> > >  		migrate_disable_switch(rq, prev);
> > > +		psi_account_irqtime(prev, next, &rq->psi_irq_time);
> > 
> > Hmm are prev and next swapped here? next == curr in my view if there's no
> > subtly I missed
> 
> This is before context_switch() so prev == current at this point.
> However, more importantly, the PSI thing accounts to its 'curr' group
> and that should very much be the outgoing task's group in this case.
> 
> That is, we need to make sure the outgoing group is up-to-date before
> switching to a new group.
> 
> Makes sense?

Yes, thanks! After reviewing the patch though I wondered if we care to handle
when the group changes while we're RUNNING. There will be no __schedule() then.
But it's a matter of how accurate this is supposed to be. In sched_move_task()
we dequeue with DEQUEUE_NOCLOCK which IIUC would have not caused
psi_account_irqtime() to be called on the move already. So we're doing the
same.

We can still maybe move this to sched_change_group() (keeping the psi group
change check you added)? This still LGTM anyway, but thought I'll spill it out.

> 
> > >  		psi_sched_switch(prev, next, !task_on_rq_queued(prev));
> > >  
> > >  		trace_sched_switch(sched_mode & SM_MASK_PREEMPT, prev, next, prev_state);
> > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> > > index 146baa91d104..65bba162408f 100644
> > > --- a/kernel/sched/psi.c
> > > +++ b/kernel/sched/psi.c
> > > @@ -991,22 +991,31 @@ void psi_task_switch(struct task_struct *prev, struct task_struct *next,
> > >  }
> > >  
> > >  #ifdef CONFIG_IRQ_TIME_ACCOUNTING
> > > -void psi_account_irqtime(struct task_struct *task, u32 delta)
> > > +void psi_account_irqtime(struct task_struct *curr, struct task_struct *prev, u64 *time)
> > >  {
> > > -	int cpu = task_cpu(task);
> > > +	int cpu = task_cpu(curr);
> > >  	struct psi_group *group;
> > >  	struct psi_group_cpu *groupc;
> > > -	u64 now;
> > > +	u64 now, irq;
> > > +	s64 delta;
> > >  
> > >  	if (static_branch_likely(&psi_disabled))
> > >  		return;
> > >  
> > > -	if (!task->pid)
> > > +	if (!curr->pid)
> > > +		return;
> > > +
> > > +	group = task_psi_group(curr);
> > > +	if( prev && task_psi_group(prev) == group)
> > 
> > nit: whitespace misplaced
> 
> Ha!, is that's all and it all works in one go it's awesome :-)
> 
> I'm still trying to learn to type again after switching keyboards. I've
> used a thinkpad keyboard (either on an actual laptop or the travel
> version on my desktop for nearly 20 years... Now I've picked up a split
> keyboard out of necessity (UHK 60 v2 for those interested) and muscle
> memory is still cursing me every single day.

I failed to use those split keyboards..

> 
> As a result, I now also cannot type on my laptop anymore, so lose-lose I
> suppose ... urgh.

You mean you're not using AI yet?! :-O

> 
> > LGTM otherwise.
> > 
> > Reviewed-by: Qais Yousef <qyousef@...alina.io>
> 
> Thanks!
> 
> John, can you write up a changelog with some pretty numbers and all
> that? Also, when you re-post, can you make sure to Cc the PSI folks
> (johannes and suren iirc, get_maintainers.pl seems to find them).

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ