Message-ID: <20221120233543.yerjypgym7laxe42@airbuntu>
Date:   Sun, 20 Nov 2022 23:35:43 +0000
From:   Qais Yousef <qyousef@...alina.io>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     Xuewen Yan <xuewen.yan@...soc.com>, peterz@...radead.org,
        mingo@...hat.com, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com,
        vschneid@...hat.com, ke.wang@...soc.com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/rt: Use cpu_active_mask to prevent
 rto_push_irq_work's dead loop

On 11/17/22 17:00, Steven Rostedt wrote:
> On Mon, 14 Nov 2022 20:04:53 +0800
> Xuewen Yan <xuewen.yan@...soc.com> wrote:
> 
> > +++ b/kernel/sched/rt.c
> > @@ -2219,6 +2219,7 @@ static int rto_next_cpu(struct root_domain *rd)
> >  {
> >  	int next;
> >  	int cpu;
> > +	struct cpumask tmp_cpumask;
> 
> If you have a machine with thousands of CPUs, this will likely kill the
> stack.
> 
> >  
> >  	/*
> >  	 * When starting the IPI RT pushing, the rto_cpu is set to -1,
> > @@ -2238,6 +2239,11 @@ static int rto_next_cpu(struct root_domain *rd)
> >  		/* When rto_cpu is -1 this acts like cpumask_first() */
> >  		cpu = cpumask_next(rd->rto_cpu, rd->rto_mask);
> >  
> > +		cpumask_and(&tmp_cpumask, rd->rto_mask, cpu_active_mask);
> > +		if (rd->rto_cpu == -1 && cpumask_weight(&tmp_cpumask) == 1 &&
> > +		    cpumask_test_cpu(smp_processor_id(), &tmp_cpumask))
> > +			break;
> > +
> 
> Kill the above.
> 
> >  		rd->rto_cpu = cpu;
> >  
> >  		if (cpu < nr_cpu_ids) {
> 
> Why not just add here:
> 
> 			if (!cpumask_test_cpu(cpu, cpu_active_mask))
> 				continue;
> 			return cpu;
> 		}
> 
> ?

Or this?


	diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
	index ed2a47e4ddae..5073e06e34c3 100644
	--- a/kernel/sched/rt.c
	+++ b/kernel/sched/rt.c
	@@ -2236,7 +2236,7 @@ static int rto_next_cpu(struct root_domain *rd)
		for (;;) {

			/* When rto_cpu is -1 this acts like cpumask_first() */
	-               cpu = cpumask_next(rd->rto_cpu, rd->rto_mask);
	+               cpu = cpumask_next_and(rd->rto_cpu, rd->rto_mask, cpu_active_mask);

			rd->rto_cpu = cpu;


I think we might be circumventing the root cause though. It looks like there
are only 2 CPUs online in the system, and one of them (cpu1) goes offline while
the other (cpu0) is overloaded, ending with the tasks ping-ponging?

If that's the case, it seems to me the rto state needs to be cleared for cpu0
so it stops attempting to push tasks. I'd assume this usually happens, but
there's a race that prevents it from realizing this.

Maybe something like this would be better? Assuming I understood the problem
correctly, of course.


	diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
	index ed2a47e4ddae..d80072cc2596 100644
	--- a/kernel/sched/rt.c
	+++ b/kernel/sched/rt.c
	@@ -334,6 +334,10 @@ static inline void rt_set_overload(struct rq *rq)
		if (!rq->online)
			return;

	+       /* Overload is meaningless if we shrank to become a UP system */
	+       if (cpumask_weight(cpu_online_mask) == 1)
	+               return;
	+
		cpumask_set_cpu(rq->cpu, rq->rd->rto_mask);
		/*
		 * Make sure the mask is visible before we set

This could be working around the problem too, not sure. But IMHO the bug is
that if we shrink to a UP system, the whole push-pull mechanism should retire
temporarily. I assume this usually happens, but there's a race somewhere.


Thanks

--
Qais Yousef
