linux-kernel - Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt

Message-Id: <200911091647.54171.rjw@sisk.pl>
Date:	Mon, 9 Nov 2009 16:47:54 +0100
From:	"Rafael J. Wysocki" <rjw@...k.pl>
To:	Mike Galbraith <efault@....de>
Cc:	Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...e.hu>,
	LKML <linux-kernel@...r.kernel.org>,
	pm list <linux-pm@...ts.linux-foundation.org>,
	Greg KH <gregkh@...e.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Jesse Barnes <jbarnes@...tuousgeek.org>
Subject: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd

On Monday 09 November 2009, Mike Galbraith wrote:
> On Mon, 2009-11-09 at 15:27 +0100, Rafael J. Wysocki wrote:
> > On Monday 09 November 2009, Mike Galbraith wrote:
> > > On Mon, 2009-11-09 at 15:02 +0100, Thomas Gleixner wrote:
> > > > On Mon, 9 Nov 2009, Ingo Molnar wrote:
> > > > > 
> > > 
> > > > > ok, then my observation should not apply.
> > > > 
> > > > I think it _IS_ related because the worker_thread is CPU affine and
> > > > the debug_smp_processor_id() check does:
> > > > 
> > > >     if (cpumask_equal(&current->cpus_allowed, cpumask_of(this_cpu)))
> > > > 
> > > > which suppresses the smp_processor_id() warning in preempt-enabled
> > > > regions of ksoftirqd and keventd.
> > > > 
> > > > We saw exactly the same backtrace with fd21073 (sched: Fix affinity
> > > > logic in select_task_rq_fair()).
> > > > 
> > > > Rafael, can you please add a printk to debug_smp_processor_id() so we
> > > > can see which CPU we are running on? I suspect we are on the wrong
> > > > one.
> > > 
> > > I wonder if that's not intimately related to the problem I had, namely
> > > newidle balancing offline CPUs as they're coming up, making a mess of
> > > cpu enumeration.
> > 
> > Very likely.  What did you do to fix it?
> 
> You don't really wanna know.  In 31 with newidle enabled, the below
> fixed it.  It won't fix 32, though it might cure the resume problem.

OK, I'll give it a try.

> diff --git a/kernel/sched.c b/kernel/sched.c
> index 1b59e26..6e71932 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -4032,7 +4049,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  	unsigned long flags;
>  	struct cpumask *cpus = __get_cpu_var(load_balance_tmpmask);
>  
> -	cpumask_setall(cpus);
> +	cpumask_copy(cpus, cpu_online_mask);
>  
>  	/*
>  	 * When power savings policy is enabled for the parent domain, idle
> @@ -4195,7 +4212,7 @@ load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd)
>  	int all_pinned = 0;
>  	struct cpumask *cpus = __get_cpu_var(load_balance_tmpmask);
>  
> -	cpumask_setall(cpus);
> +	cpumask_copy(cpus, cpu_online_mask);
>  
>  	/*
>  	 * When power savings policy is enabled for the parent domain, idle
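
For readers following the patch, here is a minimal userspace sketch of why
the starting mask matters. It is a toy model only: the toy_* names, the
4-CPU limit and the bitmask type are assumptions made up for illustration,
not the kernel's struct cpumask or its API. It just shows that a candidate
mask built with "set all" includes CPUs that are not online, while one
copied from the online mask does not.

```c
#include <assert.h>

/* Toy model: one bit per CPU. This is NOT the kernel's struct cpumask. */
typedef unsigned long toy_cpumask_t;

#define TOY_NR_CPUS 4	/* hypothetical number of possible CPUs */

/* Rough analogue of cpumask_setall(): every possible CPU is a candidate. */
static toy_cpumask_t toy_setall(void)
{
	return (1UL << TOY_NR_CPUS) - 1;
}

/* Rough analogue of cpumask_copy(cpus, cpu_online_mask). */
static toy_cpumask_t toy_copy(toy_cpumask_t src)
{
	return src;
}

/* Count candidate CPUs that are not online. */
static int toy_offline_candidates(toy_cpumask_t candidates,
				  toy_cpumask_t online)
{
	toy_cpumask_t stray = candidates & ~online;
	int n = 0;

	while (stray) {
		n += (int)(stray & 1UL);
		stray >>= 1;
	}
	return n;
}
```

With only CPUs 0 and 1 online (mask 0x3), toy_setall() leaves two offline
CPUs among the candidates, whereas toy_copy() of the online mask leaves
none.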

Thanks,
Rafael