Message-ID: <4A44D66A.9020404@novell.com>
Date:	Fri, 26 Jun 2009 10:08:42 -0400
From:	Gregory Haskins <ghaskins@...ell.com>
To:	"Michael S. Tsirkin" <mst@...hat.com>
CC:	dhowells@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] slow-work: add (module*)work->owner to fix races	with
 module clients

Michael S. Tsirkin wrote:
> On Fri, Jun 26, 2009 at 09:52:53AM -0400, Gregory Haskins wrote:
>   
>> Michael S. Tsirkin wrote:
>>     
>>> On Fri, Jun 26, 2009 at 08:00:45AM -0400, Gregory Haskins wrote:
>>>   
>>>       
>>>> Gregory Haskins wrote:
>>>>     
>>>>         
>>>>> (Try 3: applies to Linus' git master:626f380d)
>>>>>
>>>>> [ Changelog:
>>>>>
>>>>>     v3:
>>>>>         *) moved (module*)owner to slow_work_ops 
>>>>>         *) removed useless barrier()
>>>>> 	*) updated documentation/comments 
>>>>>
>>>>>     v2:
>>>>> 	*) cache "owner" value to prevent invalid access after put_ref
>>>>>
>>>>>     v1:
>>>>> 	*) initial release
>>>>> ]
>>>>>
>>>>>   
>>>>>       
>>>>>           
>>>> (I know there were several versions of this patch floating around.  This
>>>> was compounded by the fact that I had also originally submitted it as
>>>> part of a larger series against KVM and those problems I had with my
>>>> mailer.  But FWIW: This is the latest version to consider for merging to
>>>> mainline.  I've CC'd Michael Tsirkin who has reviewed this patch. 
>>>> Perhaps I can prod an Acked-by/Reviewed-by tag out of him ;) )
>>>>
>>>> Kind Regards,
>>>> -Greg
>>>>     
>>>>         
>>> The race itself seems to be real, and the patch looks good to me.
>>> There's ongoing discussion on whether KVM needs to use slow-work,
>>> but there are other modular users which will benefit from this.
>>>
>>> Reviewed-by: Michael S. Tsirkin <mst@...hat.com>
>>>
>>> By the way: I think you also need to update all users, which include
>>> at least GFS2 and fscache, to init the owner field.
>>>   
>>>       
>> Good catch!  That was a side effect of v3 since v2 used to have the
>> owner in the slow_work and do the init implicitly in slow_work_init(). 
>> Should I respin a v4 with those new hunks, or should we patch those
>> separately?
>>
>> -Greg
>>     
>
> I think you need v4 otherwise bisect will be broken.
>   

I have no problem with a v4; let's plan on that.  However, note that
bisectability wouldn't actually be an issue: GCC would simply assign
.owner = NULL for any client that doesn't set it explicitly.  All this
means is that those clients would still be broken even with this patch
applied, but they would be no worse off than they are today.

Note that I am technically taking today off, so any respin will probably
not come out until next week.

Thanks guys,
-Greg
>   
>>>   
>>>       
>>>>> -------------------------
>>>>>
>>>>> slow-work: add (module*)work->owner to fix races with module clients
>>>>>
>>>>> The slow_work facility was designed to use reference counting instead of
>>>>> barriers for synchronization.  The reference counting mechanism is
>>>>> implemented as a vtable op (->get_ref, ->put_ref) callback.  This is
>>>>> problematic for module use of the slow_work facility because it is
>>>>> impossible to synchronize against the .text installed in the callbacks:
>>>>> There is no way to ensure that the slow-work threads have completely
>>>>> exited the .text in question and rmmod may yank it out from under the
>>>>> slow_work thread.
>>>>>
>>>>> This patch attempts to address this issue by adding a "struct module *owner"
>>>>> field to slow_work_ops, and maintaining a module reference
>>>>> count coincident with the more externally visible reference count.  Since
>>>>> the slow_work facility is resident in the kernel, it is a race-free
>>>>> location from which to issue a module_put() call.  This ensures that
>>>>> modules can properly clean up before exiting.
>>>>>
>>>>> A module_get()/module_put() pair on slow_work_enqueue() and the subsequent
>>>>> dequeue technically adds the overhead of the atomic operations for every
>>>>> work item scheduled.  However, slow_work is designed for deferring
>>>>> relatively long-running and/or sleepy tasks to begin with, so this
>>>>> overhead will hopefully be negligible.
>>>>>
>>>>> Signed-off-by: Gregory Haskins <ghaskins@...ell.com>
>>>>> CC: David Howells <dhowells@...hat.com>
>>>>> ---
>>>>>
>>>>>  Documentation/slow-work.txt |    6 +++++-
>>>>>  include/linux/slow-work.h   |    3 +++
>>>>>  kernel/slow-work.c          |   20 +++++++++++++++++++-
>>>>>  3 files changed, 27 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/slow-work.txt b/Documentation/slow-work.txt
>>>>> index ebc50f8..2a38878 100644
>>>>> --- a/Documentation/slow-work.txt
>>>>> +++ b/Documentation/slow-work.txt
>>>>> @@ -80,6 +80,7 @@ Slow work items may then be set up by:
>>>>>   (2) Declaring the operations to be used for this item:
>>>>>  
>>>>>  	struct slow_work_ops myitem_ops = {
>>>>> +	        .owner   = THIS_MODULE,
>>>>>  		.get_ref = myitem_get_ref,
>>>>>  		.put_ref = myitem_put_ref,
>>>>>  		.execute = myitem_execute,
>>>>> @@ -102,7 +103,10 @@ A suitably set up work item can then be enqueued for processing:
>>>>>  	int ret = slow_work_enqueue(&myitem);
>>>>>  
>>>>>  This will return a -ve error if the thread pool is unable to gain a reference
>>>>> -on the item, 0 otherwise.
>>>>> +on the item, 0 otherwise.  Loadable modules may only enqueue work if at least
>>>>> +one reference to the module is known to be held.  The slow-work infrastructure
>>>>> +will acquire a reference to the module and hold it until after the item's
>>>>> +reference is dropped, assuring the stability of the callback.
>>>>>  
>>>>>  
>>>>>  The items are reference counted, so there ought to be no need for a flush
>>>>> diff --git a/include/linux/slow-work.h b/include/linux/slow-work.h
>>>>> index b65c888..1382918 100644
>>>>> --- a/include/linux/slow-work.h
>>>>> +++ b/include/linux/slow-work.h
>>>>> @@ -17,6 +17,7 @@
>>>>>  #ifdef CONFIG_SLOW_WORK
>>>>>  
>>>>>  #include <linux/sysctl.h>
>>>>> +#include <linux/module.h>
>>>>>  
>>>>>  struct slow_work;
>>>>>  
>>>>> @@ -24,6 +25,8 @@ struct slow_work;
>>>>>   * The operations used to support slow work items
>>>>>   */
>>>>>  struct slow_work_ops {
>>>>> +	struct module *owner;
>>>>> +
>>>>>  	/* get a ref on a work item
>>>>>  	 * - return 0 if successful, -ve if not
>>>>>  	 */
>>>>> diff --git a/kernel/slow-work.c b/kernel/slow-work.c
>>>>> index 09d7519..18dee34 100644
>>>>> --- a/kernel/slow-work.c
>>>>> +++ b/kernel/slow-work.c
>>>>> @@ -145,6 +145,15 @@ static unsigned slow_work_calc_vsmax(void)
>>>>>  	return min(vsmax, slow_work_max_threads - 1);
>>>>>  }
>>>>>  
>>>>> +static void slow_work_put(struct slow_work *work)
>>>>> +{
>>>>> +	/* cache values that are needed during/after pointer invalidation */
>>>>> +	struct module *owner = work->ops->owner;
>>>>> +
>>>>> +	work->ops->put_ref(work);
>>>>> +	module_put(owner);
>>>>> +}
>>>>> +
>>>>>  /*
>>>>>   * Attempt to execute stuff queued on a slow thread.  Return true if we managed
>>>>>   * it, false if there was nothing to do.
>>>>> @@ -219,7 +228,7 @@ static bool slow_work_execute(void)
>>>>>  		spin_unlock_irq(&slow_work_queue_lock);
>>>>>  	}
>>>>>  
>>>>> -	work->ops->put_ref(work);
>>>>> +	slow_work_put(work);
>>>>>  	return true;
>>>>>  
>>>>>  auto_requeue:
>>>>> @@ -299,6 +308,14 @@ int slow_work_enqueue(struct slow_work *work)
>>>>>  		if (test_bit(SLOW_WORK_EXECUTING, &work->flags)) {
>>>>>  			set_bit(SLOW_WORK_ENQ_DEFERRED, &work->flags);
>>>>>  		} else {
>>>>> +			/*
>>>>> +			 * Callers must ensure that their module has at least
>>>>> +			 * one reference held while the work is enqueued.  We
>>>>> +			 * will acquire another reference here and drop it
>>>>> +			 * once we do the last ops->put_ref()
>>>>> +			 */
>>>>> +			__module_get(work->ops->owner);
>>>>> +
>>>>>  			if (work->ops->get_ref(work) < 0)
>>>>>  				goto cant_get_ref;
>>>>>  			if (test_bit(SLOW_WORK_VERY_SLOW, &work->flags))
>>>>> @@ -313,6 +330,7 @@ int slow_work_enqueue(struct slow_work *work)
>>>>>  	return 0;
>>>>>  
>>>>>  cant_get_ref:
>>>>> +	module_put(work->ops->owner);
>>>>>  	spin_unlock_irqrestore(&slow_work_queue_lock, flags);
>>>>>  	return -EAGAIN;
>>>>>  }
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>>>>> the body of a message to majordomo@...r.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>> Please read the FAQ at  http://www.tux.org/lkml/
>>>>>   
>>>>>       
>>>>>           
>>>>     
>>>>         
>>>   
>>>       
>>     
>
>
>   



