Message-Id: <20080427183926.acb66fff.akpm@linux-foundation.org>
Date:	Sun, 27 Apr 2008 18:39:26 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Heiko Carstens <heiko.carstens@...ibm.com>
Cc:	Gautham R Shenoy <ego@...ibm.com>, Ingo Molnar <mingo@...e.hu>,
	Paul Jackson <pj@....com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched: missing locking in sched_domains code

On Sun, 27 Apr 2008 23:12:24 +0200 Heiko Carstens <heiko.carstens@...ibm.com> wrote:

> From: Heiko Carstens <heiko.carstens@...ibm.com>
> 
> Concurrent calls to detach_destroy_domains and arch_init_sched_domains
> were prevented by the old scheduler subsystem cpu hotplug mutex. When
> this got converted to get_online_cpus() the locking got broken.
> Unlike before, several processes can now concurrently enter the critical
> sections that were protected by the old lock.
> 
> So add a new sched_domains_mutex which protects these sections again.
> 
> Cc: Gautham R Shenoy <ego@...ibm.com>
> Cc: Ingo Molnar <mingo@...e.hu>
> Cc: Paul Jackson <pj@....com>
> Signed-off-by: Heiko Carstens <heiko.carstens@...ibm.com>
> ---
>  include/linux/sched.h |    2 ++
>  kernel/cpuset.c       |    2 ++
>  kernel/sched.c        |   11 +++++++++++
>  3 files changed, 15 insertions(+)
> 
> Index: linux-2.6/kernel/sched.c
> ===================================================================
> --- linux-2.6.orig/kernel/sched.c
> +++ linux-2.6/kernel/sched.c
> @@ -7807,14 +7807,23 @@ match2:
>  	unlock_doms_cur();
>  }
>  
> +/*
> + * Protects against concurrent calls to detach_destroy_domains
> + * and arch_init_sched_domains.
> + */
> +DEFINE_MUTEX(sched_domains_mutex);
> +
>  #if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
>  int arch_reinit_sched_domains(void)
>  {
> +	static DEFINE_MUTEX(arch_reinit_sched_domains_mutex);

This looks like a leftover hunk: arch_reinit_sched_domains_mutex is defined
but never used.

>  	int err;
>  
>  	get_online_cpus();
> +	mutex_lock(&sched_domains_mutex);
>  	detach_destroy_domains(&cpu_online_map);
>  	err = arch_init_sched_domains(&cpu_online_map);
> +	mutex_unlock(&sched_domains_mutex);
>  	put_online_cpus();
>  
>  	return err;
> @@ -7932,10 +7941,12 @@ void __init sched_init_smp(void)
>  	BUG_ON(sched_group_nodes_bycpu == NULL);
>  #endif
>  	get_online_cpus();
> +	mutex_lock(&sched_domains_mutex);
>  	arch_init_sched_domains(&cpu_online_map);
>  	cpus_andnot(non_isolated_cpus, cpu_possible_map, cpu_isolated_map);
>  	if (cpus_empty(non_isolated_cpus))
>  		cpu_set(smp_processor_id(), non_isolated_cpus);
> +	mutex_unlock(&sched_domains_mutex);
>  	put_online_cpus();
>  	/* XXX: Theoretical race here - CPU may be hotplugged now */
>  	hotcpu_notifier(update_sched_domains, 0);
> Index: linux-2.6/include/linux/sched.h
> ===================================================================
> --- linux-2.6.orig/include/linux/sched.h
> +++ linux-2.6/include/linux/sched.h
> @@ -809,6 +809,8 @@ struct sched_domain {
>  #endif
>  };
>  
> +extern struct mutex sched_domains_mutex;
> +
>  extern void partition_sched_domains(int ndoms_new, cpumask_t *doms_new,
>  				    struct sched_domain_attr *dattr_new);
>  extern int arch_reinit_sched_domains(void);
> Index: linux-2.6/kernel/cpuset.c
> ===================================================================
> --- linux-2.6.orig/kernel/cpuset.c
> +++ linux-2.6/kernel/cpuset.c
> @@ -684,7 +684,9 @@ restart:
>  rebuild:
>  	/* Have scheduler rebuild sched domains */
>  	get_online_cpus();
> +	mutex_lock(&sched_domains_mutex);
>  	partition_sched_domains(ndoms, doms, dattr);
> +	mutex_unlock(&sched_domains_mutex);
>  	put_online_cpus();
>  

It seems a bit fragile to take this lock in the caller without even adding
a comment at the callee site which documents the new locking rule.

It would be more robust to take the lock within partition_sched_domains().

partition_sched_domains() already covers itself with lock_doms_cur().  Can
we take that in arch_reinit_sched_domains() rather than adding the new lock?
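
As a rough userspace illustration of the two suggestions above (this is not
kernel code), the sketch below takes a single lock inside the callee and
reuses that same lock in the reinit path instead of adding a second mutex.
All names here (doms_lock, rebuild_domains_locked(), partition_domains(),
reinit_domains(), domains_snapshot()) are hypothetical stand-ins, and a
pthread mutex is assumed in place of whatever lock_doms_cur() actually takes.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the lock taken by lock_doms_cur(). */
static pthread_mutex_t doms_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-ins for the scheduler's domain state. */
static int ndoms_cur = 1;
static int rebuilds;

/* Locking rule, documented at the callee: doms_lock must be held. */
static void rebuild_domains_locked(int ndoms_new)
{
	ndoms_cur = ndoms_new;
	rebuilds++;
}

/* Takes the lock itself, so callers cannot forget to take it. */
void partition_domains(int ndoms_new)
{
	assert(ndoms_new > 0);
	pthread_mutex_lock(&doms_lock);
	rebuild_domains_locked(ndoms_new);
	pthread_mutex_unlock(&doms_lock);
}

/* Reuses the same lock rather than introducing a new one. */
int reinit_domains(void)
{
	pthread_mutex_lock(&doms_lock);
	rebuild_domains_locked(1);
	pthread_mutex_unlock(&doms_lock);
	return 0;
}

/* Reads the state under the lock; returns the current domain count. */
int domains_snapshot(int *rebuild_count)
{
	int n;

	pthread_mutex_lock(&doms_lock);
	n = ndoms_cur;
	*rebuild_count = rebuilds;
	pthread_mutex_unlock(&doms_lock);
	return n;
}
```

With this shape, a caller analogous to the cpuset rebuild path would simply
call partition_domains() with no locking of its own, and the locking rule
lives in one place next to the data it protects.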

