linux-kernel - Re: [RFD] Freezing of kernel threads

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20070514061846.GA30625@in.ibm.com>
Date:	Mon, 14 May 2007 11:48:46 +0530
From:	Srivatsa Vaddagiri <vatsa@...ibm.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Gautham R Shenoy <ego@...ibm.com>,
	"Rafael J. Wysocki" <rjw@...k.pl>, Pavel Machek <pavel@....cz>,
	Oleg Nesterov <oleg@...sign.ru>,
	Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Dipankar Sarma <dipankar@...ibm.com>,
	Ingo Molnar <mingo@...e.hu>,
	Paul E McKenney <paulmck@...ibm.com>
Subject: Re: [RFD] Freezing of kernel threads

On Sun, May 13, 2007 at 09:33:41AM -0700, Linus Torvalds wrote:
> On Sun, 13 May 2007, Gautham R Shenoy wrote:
> > RFC #1: Use get_hot_cpus()- put_hot_cpus() , which follow the
> > well known refcounting model.
> 
> Yes. And usign the "preempt count" as a refcount is fairly natural, no? 
> We do already depend on that in many code-paths.
> 
> It also has the advantage since it's not a *blocking* lock, [...]

get/put_hot_cpus() was intended to be similar and not same as
get/put_cpu().

One difference is get_hot_cpus() has to be a blocking lock. It has to block 
when there is a cpu_down/up operation already in progress, otherwise it isn't 
of much help to serialize readers/writers. Note that a cpu_down/up is marked in 
progress *before* the first notifier is sent (CPU_DOWN/UP_PREPARE) and not just
when changing the cpu_online_map bitmap.

Because it can be a blocking call, there needs to be associated
machinery to wake up sleeping readers/writers.

The other complication get/put_hotcpu() had was dealing with
write-followed-by-read lock attempt by the *same* thread (whilst doing
cpu_down/up).  IIRC this was triggered by some callback processing in CPU_DEAD 
or CPU_DOWN_PREPARE.

cpu_down()
 |- take write lock 
 |- CPU_DOWN_PREPARE
 |        |- foo() wants a read_lock 

Stupid as it sounds, it was really found to be happening!  Gautham, do you 
recall who that foo() was? Somebody in cpufreq I guess ..

Tackling that requires some state bit in task_struct to educate
read_lock to be a no-op if write lock is already held by the thread.

In summary, get/put_hot_cpu() will need to be (slightly) more complex than
something like get/put_cpu(). Perhaps this complexity was what put off
Andrew when he suggested the use of freezer (http://lkml.org/lkml/2006/11/1/400)

> For example, since all users of cpu_online_map should be pure *readers* 
> (apart from a couple of cases that actually bring up a CPU), you can do 
> things like
> 
> 	#define cpu_online_map check_cpu_online_map()
> 
> 	static inline cpumask_t check_cpu_online_map(void)
> 	{
> 		WARN_ON(!preempt_safe()); /* or whatever lock we decide on */
> 		return __real_cpu_online_map;
> 	}

I remember Rusty had a similar function to check for unsafe references
to cpu_online_map way back when cpu hotplug was being developed. It will
be a good idea to reintroduce that back.

> and it will nicely catch things like that.

-- 
Regards,
vatsa
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/