linux-kernel - Re: Module loading/unloading and "The Stop Machine"

Date:	Thu, 21 Feb 2008 17:24:47 -0800
From:	Max Krasnyanskiy <maxk@...lcomm.com>
To:	Tejun Heo <htejun@...il.com>
CC:	rusty@...tcorp.com.au, Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: Module loading/unloading and "The Stop Machine"

Hi Tejun,

> Max Krasnyansky wrote:
>> I was hoping you could answer a couple of questions about module loading/unloading
>> and the stop machine.
>> There was a recent discussion on LKML about CPU isolation patches I'm working on.
>> One of the patches makes stop machine ignore the isolated CPUs. People of course had
>> questions about that. So I started looking into more details and got this silly, crazy 
>> idea that maybe we do not need the stop machine any more :)
>>
>> As far as I can tell the stop machine is basically a safety net in case some locking
>> and refcounting mechanisms aren't bulletproof. In other words if a subsystem can actually
>> handle registration/unregistration in a robust way, module loader/unloader does not 
>> necessarily have to halt entire machine in order to load/unload a module that belongs
>> to that subsystem. I may of course be completely wrong on that.
> 
> Nope, it's an integral part of module reference counting.  When using
> refcnt for object lifetime management, the last put should be atomic
> against initial get of the object.  This is usually achieved by
> acquiring the lock used for object lookup before putting or using
> atomic_dec_and_lock().
> 
> For module reference counts, this means that try_module_get() and
> try_stop_module() should be atomic.  Note that modules don't use simple
> refcnt so the latter part isn't module_put() but the analogy still
> works.  There are two ways to synchronize try_module_get() against
> try_stop_module() - the traditional is to grab lock in try_module_get()
> and use atomic_dec_and_lock() in try_stop_module(), which works but is
> bad performance-wise because try_module_get() is used far more often than
> try_stop_module() is.  For example, an IO command can go through several
> try_module_get()'s.
> 
> So, all the burden of synchronization is put onto try_stop_module().
> Because all of the cpus on the machine are stopped and none of them has
> been stopped in the middle of non-preemptible code, __try_stop_module()
> is synchronized from try_module_get() even though all the
> synchronization try_module_get() does is get_cpu().
Thanks for the info. I guess I missed that from the code. In any case that seems like a
pretty heavy refcounting mechanism, in the sense that every time something is loaded or
unloaded the entire machine freezes, potentially for several milliseconds. Normally it's not a
big deal. But once you get more and more CPUs and/or start using realtime apps this becomes
a big deal. And it's plain broken for the use case that I mentioned during the CPU isolation
discussions, i.e. when user-space thread(s) prevent the stopmachine kthread from running, in
which case the machine simply hangs until those user-space threads exit.

Initially I assumed that it had to do with subsystem registration/unregistration being
potentially unsafe. If it's only for module ref counting, there's gotta be a less expensive way.
I'll think some more about it.
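
To make the two schemes from Tejun's explanation concrete, here is a rough user-space
model. It is only a sketch, not the kernel code: every name in it (locked_mod, percpu_mod,
NCPUS and the functions) is made up for illustration, a pthread mutex stands in for the
lookup lock, and the "all other CPUs are stopped" precondition of the second scheme is
assumed by the caller rather than enforced.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NCPUS 4 /* hypothetical CPU count for this model */

enum mod_state { MOD_LIVE, MOD_GOING };

/* Scheme A: every get takes the lock, so get and stop are serialized. */
struct locked_mod {
    pthread_mutex_t lock; /* stands in for the lookup lock */
    int refcnt;
    enum mod_state state;
};

static bool locked_try_get(struct locked_mod *m)
{
    pthread_mutex_lock(&m->lock);
    bool ok = (m->state == MOD_LIVE);
    if (ok)
        m->refcnt++;
    pthread_mutex_unlock(&m->lock);
    return ok;
}

static void locked_put(struct locked_mod *m)
{
    pthread_mutex_lock(&m->lock);
    assert(m->refcnt > 0);
    m->refcnt--;
    pthread_mutex_unlock(&m->lock);
}

static bool locked_try_stop(struct locked_mod *m)
{
    pthread_mutex_lock(&m->lock);
    bool stopped = (m->refcnt == 0);
    if (stopped)
        m->state = MOD_GOING; /* later gets will fail */
    pthread_mutex_unlock(&m->lock);
    return stopped;
}

/* Scheme B: lock-free per-CPU counters; all the cost moves to the stop path. */
struct percpu_mod {
    int ref[NCPUS];
    enum mod_state state;
};

/* 'cpu' models the index a kernel caller would hold with preemption disabled. */
static bool percpu_try_get(struct percpu_mod *m, int cpu)
{
    if (m->state != MOD_LIVE)
        return false;
    m->ref[cpu]++;
    return true;
}

/* A put may land on a different CPU than the get; only the sum matters. */
static void percpu_put(struct percpu_mod *m, int cpu)
{
    m->ref[cpu]--;
}

/*
 * Assumption: the caller has halted every other CPU, none of them inside
 * percpu_try_get(). Only then are the unlocked check and update below safe.
 */
static bool percpu_try_stop(struct percpu_mod *m)
{
    int total = 0;
    for (int i = 0; i < NCPUS; i++)
        total += m->ref[i];
    if (total != 0)
        return false;
    m->state = MOD_GOING;
    return true;
}
```

In scheme A the hot path (get/put) pays for a lock on every call. In scheme B it is a plain
per-CPU increment, and that is only correct if nothing else can run during
percpu_try_stop(), which is the job the stop machine does.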
 
>> The problem with the stop machine is that it's a very very big gun :). In a sense that 
>> it totally kills all the latencies and stuff since the entire machine gets halted while
>> module is being (un)loaded. Which is a major issue for any realtime apps. Specifically 
>> for CPU isolation the issue is that high-priority rt user-space thread prevents stop 
>> machine threads from running and entire box just hangs waiting for it. 
>> I'm kind of surprised that folks who use monster boxes with over 100 CPUs have not 
>> complained. It must be a huge hit for those machines to halt the entire thing.
>>
>> It seems that over the last few years most subsystems got much better at locking and 
>> refcounting. And I'm hoping that we can avoid halting the entire machine these days.
>> For CPU isolation in particular the solution is simple. We can just ignore isolated CPUs. 
>> What I'm trying to figure out is how safe it is and whether we can avoid full halt 
>> altogether.
> 
> Without the stop_machine call, there's no synchronization between
> initial get and final put.  Things will break.
Got it.
Thanks again for the explanation. I'll stare at the module code some more with what you said
in mind.

Max