Message-ID: <4A42A259.9000306@novell.com>
Date: Wed, 24 Jun 2009 18:02:01 -0400
From: Gregory Haskins <ghaskins@...ell.com>
To: David Howells <dhowells@...hat.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH] slow-work: add (module*)work->owner to fix races with module clients
David Howells wrote:
> Gregory Haskins <ghaskins@...ell.com> wrote:
>
>
>> I found this while working on KVM. I actually posted this patch with a KVM
>> series yesterday and standalone earlier today, but neither seems to have
>> made it to the lists. I suspect there is an issue with git-mail/postfix
>> on my system.
>>
>
> Also, your mail client has damaged the whitespace in the patch.
>
Yeah, sorry about that. When git-mail was failing I cut-n-pasted into
Thunderbird and it munged it a bit. v2 should be better as it came out
of git directly after I fixed the postfix misconfig.
>
>> struct slow_work {
>> + struct module *owner;
>>
>
> Can you add it to slow_work_ops instead?
>
Yeah, that makes sense.
>
>> work->ops->put_ref(work);
>> + barrier(); /* ensure that put_ref is not re-ordered with module_put */
>> + module_put(work->owner);
>>
>
> Ummm... Can it be? module_put() and put_ref() are both out of line - surely
> the compiler isn't allowed to reorder them? If it's the CPU doing it then
> barrier() isn't going to save you.
>
Good point. I added that at the last minute without engaging my brain.
:) Will remove.
> Note, however, that work may not be dereferenced like this after put_ref() is
> called, unless you're sure that there's still a reference outstanding.
>
>
Yeah, I noticed that too immediately after sending. It should be better
in v2 (which should be in your inbox already).
>> + if (!try_module_get(work->owner))
>> + goto cant_get_mod;
>>
>
> Note that this may result in a module getting stuck in unloading. It may need
> to do some work to complete the unload, and this will prevent that.
>
Can we put a stake in the ground and say that you can only call
slow_work_enqueue() from a module if you know that at least one
reference to the module is being held? This seems like a core requirement
anyway.
The follow-up question would be: if so, should we use __module_get()
instead of try_module_get() to annotate that (in addition to a comment,
of course)?
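
To make the difference concrete, here is a rough userspace model (not
kernel code). The model_* names, the two-state enum and the enqueue_*
helpers are all made up for illustration; they only mimic the documented
behaviour that try_module_get() refuses a module that is going away,
while __module_get() increments unconditionally because the caller is
expected to hold a reference already.

```c
#include <stdbool.h>

/* Userspace model only: these mimic the documented behaviour of the
 * kernel helpers; they are not the kernel implementations. */
enum model_state { MODEL_LIVE, MODEL_GOING };

struct model_module {
	enum model_state state;
	int refcnt;
};

/* Like try_module_get(): refuses once the module has started unloading. */
bool model_try_module_get(struct model_module *mod)
{
	if (mod->state == MODEL_GOING)
		return false;
	mod->refcnt++;
	return true;
}

/* Like __module_get(): the caller asserts that it already holds a
 * reference, so there is no failure path to handle. */
void model_module_get_held(struct model_module *mod)
{
	mod->refcnt++;
}

void model_module_put(struct model_module *mod)
{
	mod->refcnt--;
}

/* Hypothetical enqueue path using the try_ variant. */
int enqueue_with_try_get(struct model_module *owner)
{
	if (!model_try_module_get(owner))
		return -1;	/* rejected, even if the module needs this work */
	return 0;
}

/* Hypothetical enqueue path relying on the "caller holds a ref" rule. */
int enqueue_with_held_get(struct model_module *owner)
{
	model_module_get_held(owner);
	return 0;
}
```

The second variant has no error path, which is the annotation I am
after: it documents that the caller must already hold a reference.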
> A better way might be to have put_ref() return, say, a pointer to a completion
> struct, and if not NULL, have the caller of put_ref() call complete() on it.
> That way you don't need to touch the module count, but can have something in
> put_ref() keep track of when the last object is released and have its caller
> invoke a completion to celebrate this fact.
>
That sounds interesting, but I am not sure whether we would run into a
similar conundrum, or whether it would be awkward to manage. I am on a
conf-call ATM so I can't think clearly enough to tell for sure. ;) Let me
give it some thought and get back to you, though.
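
For reference, this is how I read the suggestion, written as a small
userspace model rather than a patch. The struct completion and
complete() here are stand-ins for the kernel ones, and client_obj,
client_put_ref() and slow_work_release() are hypothetical names I made
up; only the idea that put_ref() returns a completion pointer or NULL
comes from the proposal above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's struct completion and complete(). */
struct completion {
	bool done;
};

void complete(struct completion *c)
{
	c->done = true;
}

struct slow_work;

struct slow_work_ops {
	/* Proposed signature: return a completion to signal, or NULL. */
	struct completion *(*put_ref)(struct slow_work *work);
};

struct slow_work {
	const struct slow_work_ops *ops;
};

/* Hypothetical client object embedding a slow_work item. */
struct client_obj {
	struct slow_work work;		/* first member, so the cast below is valid */
	int *live_objects;		/* module-wide count of outstanding objects */
	struct completion *all_released;
};

struct completion *client_put_ref(struct slow_work *work)
{
	struct client_obj *obj = (struct client_obj *)work;

	/* Last object gone: tell the caller which completion to signal. */
	if (--(*obj->live_objects) == 0)
		return obj->all_released;
	return NULL;
}

const struct slow_work_ops client_ops = { .put_ref = client_put_ref };

/* What the slow-work core would do when it drops its reference. */
void slow_work_release(struct slow_work *work)
{
	struct completion *done = work->ops->put_ref(work);

	/* 'work' must not be dereferenced past this point. */
	if (done)
		complete(done);
}
```

In this model the core never touches the module refcount; the module's
exit path would wait on all_released before returning.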
Thanks David!
-Greg