Date:	Thu, 14 Feb 2008 13:02:25 +0900
From:	Tejun Heo <htejun@...il.com>
To:	Max Krasnyansky <maxk@...lcomm.com>
CC:	rusty@...tcorp.com.au, Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: Module loading/unloading and "The Stop Machine"

Hello, Max.

Max Krasnyansky wrote:
> I was hoping you could answer a couple of questions about module loading/unloading
> and the stop machine.
> There was a recent discussion on LKML about CPU isolation patches I'm working on.
> One of the patches makes stop machine ignore the isolated CPUs. People of course had
> questions about that. So I started looking into more details and got this silly, crazy 
> idea that maybe we do not need the stop machine any more :)
> 
> As far as I can tell the stop machine is basically a safety net in case some locking
> and refcounting mechanisms aren't bulletproof. In other words if a subsystem can actually
> handle registration/unregistration in a robust way, module loader/unloader does not 
> necessarily have to halt entire machine in order to load/unload a module that belongs
> to that subsystem. I may of course be completely wrong on that.

Nope, it's an integral part of module reference counting.  When using
a refcnt for object lifetime management, the last put must be atomic
against the initial get of the object.  This is usually achieved either
by acquiring the lock used for object lookup before putting, or by
using atomic_dec_and_lock().
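
To make the first option concrete, here is a minimal userspace sketch
in which the put takes the same lock the lookup uses.  It is not kernel
code: it uses a pthread mutex, and struct obj, table_slot,
obj_publish(), obj_lookup_get() and obj_put() are hypothetical names
made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct obj {
    int id;
    int refcnt;                 /* protected by lookup_lock */
};

static pthread_mutex_t lookup_lock = PTHREAD_MUTEX_INITIALIZER;
static struct obj *table_slot;  /* one-entry stand-in for a lookup table */

/* Create an object holding one reference (the creator's) and publish it. */
static struct obj *obj_publish(int id)
{
    struct obj *o = malloc(sizeof(*o));

    if (!o)
        return NULL;
    o->id = id;
    o->refcnt = 1;
    pthread_mutex_lock(&lookup_lock);
    table_slot = o;
    pthread_mutex_unlock(&lookup_lock);
    return o;
}

/* Initial get: the lookup and the increment happen under one lock hold. */
static struct obj *obj_lookup_get(void)
{
    struct obj *o;

    pthread_mutex_lock(&lookup_lock);
    o = table_slot;
    if (o)
        o->refcnt++;
    pthread_mutex_unlock(&lookup_lock);
    return o;
}

/*
 * Put: the drop to zero and the unlink happen under the lookup lock, so
 * no lookup can find the object once its count has reached zero.
 * Returns 1 if this was the last put and the object was freed.
 */
static int obj_put(struct obj *o)
{
    int last;

    pthread_mutex_lock(&lookup_lock);
    assert(o->refcnt > 0);
    last = (--o->refcnt == 0);
    if (last && table_slot == o)
        table_slot = NULL;
    pthread_mutex_unlock(&lookup_lock);
    if (last)
        free(o);
    return last;
}
```

The cost is that every get and every put takes the lock, which is
cheap here but matters when gets are very frequent.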

For module reference counts, this means that try_module_get() and
try_stop_module() must be atomic with respect to each other.  Note that
modules don't use a simple refcnt, so the latter isn't module_put(), but
the analogy still works.  There are two ways to synchronize
try_module_get() against try_stop_module().  The traditional one is to
grab a lock in try_module_get() and use atomic_dec_and_lock() in
try_stop_module(), which works but performs badly because
try_module_get() is called far more often than try_stop_module().  For
example, a single IO command can go through several try_module_get()'s.

So, instead, the whole burden of synchronization is put onto
try_stop_module().  Because all of the cpus on the machine are stopped
and none of them has been stopped in the middle of non-preemptible code,
__try_stop_module() is synchronized against try_module_get() even though
the only synchronization try_module_get() does is get_cpu().
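
The shape of that scheme can be shown with a single-threaded userspace
model.  This is only a sketch, not the kernel implementation: struct
mod_model, NCPUS and the model_*() functions are hypothetical, the cpu
argument stands in for what get_cpu() would return, and the "all other
cpus are stopped" condition is assumed rather than enforced.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS 4                 /* arbitrary size for the model */

struct mod_model {
    bool live;                  /* cleared once the module is going away */
    long ref[NCPUS];            /* per-cpu counters; only the sum matters */
};

/*
 * Fast path: no lock, no shared cache line.  In the kernel the check and
 * the increment would sit in a section with preemption disabled, so a
 * stopped cpu can never be between the two.
 */
static bool model_try_get(struct mod_model *m, int cpu)
{
    assert(cpu >= 0 && cpu < NCPUS);
    if (!m->live)
        return false;
    m->ref[cpu]++;
    return true;
}

/* A put may land on a different cpu than the get; one slot can go negative. */
static void model_put(struct mod_model *m, int cpu)
{
    assert(cpu >= 0 && cpu < NCPUS);
    m->ref[cpu]--;
}

static long model_refcount(const struct mod_model *m)
{
    long sum = 0;

    for (int i = 0; i < NCPUS; i++)
        sum += m->ref[i];
    return sum;
}

/*
 * Slow path: assumed to run while every other cpu is stopped outside
 * non-preemptible code, so the sum cannot change between the check and
 * the state change.
 */
static bool model_try_stop(struct mod_model *m)
{
    if (model_refcount(m) != 0)
        return false;
    m->live = false;
    return true;
}
```

All the cost sits in model_try_stop(); the get path touches only its
own cpu's counter.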

> The problem with the stop machine is that it's a very very big gun :). In a sense that 
> it totally kills all the latencies and stuff since the entire machine gets halted while
> module is being (un)loaded. Which is a major issue for any realtime apps. Specifically 
> for CPU isolation the issue is that high-priority rt user-space thread prevents stop 
> machine threads from running and entire box just hangs waiting for it. 
> I'm kind of surprised that folks who use monster boxes with over 100 CPUs have not 
> complained. It must be a huge hit for those machines to halt the entire thing.
> 
> It seems that over the last few years most subsystems got much better at locking and 
> refcounting. And I'm hoping that we can avoid halting the entire machine these days.
> For CPU isolation in particular the solution is simple. We can just ignore isolated CPUs. 
> What I'm trying to figure out is how safe it is and whether we can avoid full halt 
> altogether.

Without the stop_machine call, there's no synchronization between
initial get and final put.  Things will break.

Thanks.

-- 
tejun