lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20080917213748.7571fbdf.akpm@linux-foundation.org>
Date:	Wed, 17 Sep 2008 21:37:48 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Lai Jiangshan <laijs@...fujitsu.com>
Cc:	Ingo Molnar <mingo@...e.hu>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Dipankar Sarma <dipankar@...ibm.com>,
	Peter Zijlstra <peterz@...radead.org>, manfred@...orfullife.com
Subject: Re: [RFC PATCH] rcu: introduce kfree_rcu()

On Thu, 18 Sep 2008 12:18:28 +0800 Lai Jiangshan <laijs@...fujitsu.com> wrote:

> 
> sometimes a rcu callback is just calling kfree() to free a struct's memory
> (we say this callback is a trivial callback.).
> this patch introduce kfree_rcu() to do these things directly, easily.
> 
> There are 4 reasons that we need kfree_rcu():
> 
> 1) unloadable modules:
>    a module(rcu callback is defined in this module) using rcu must
>    call rcu_barrier() when unload. rcu_barrier() will increase
>    the system's overhead(the more cpus the worse) and
>    rcu_barrier() is very time-consuming. if all rcu callback defined
>    in this module are trivial callback, we can just call kfree_rcu()
>    instead, save a rcu_barrier() when unload.
> 
> 2) duplicate code:
>    all trivial callback are duplicate code though the structs to be freed
>    are different. it's just a container_of() and a kfree().
>    There are about 50% callbacks are trivial callbacks for call_rcu() in
>    current kernel code.
> 
> 3) cache:
>    the instructions of trivial callback is not in the cache supposedly.
>    calling a trivial callback will let to cache missing very likely.
>    the more trivial callback the more cache missing. OK, this is
>    not a problem now or in a few days: Only less than 1% trivial callback
>    are called in running kernel.
> 
> 4) future:
>    the number of user of rcu is increasing. new code for rcu is
>    trivial callback very likely. it means more modules using rcu
>    and more duplicate code(may come to 90% of callbacks is trivial
>    callbacks) and more cache missing.
> 
> Implementation:
>    there were a lot of ideas came out when i implemented kfree_rcu().
>    I chose the simplest one as this patch shows. but these implementation
>    may cannot be used for to free a struct larger than 16KBytes.
> 
> kfree_rcu_bh()? kfree_rcu_sched()?
>    these two are not need current. call_rcu_bh() & call_rcu_sched()
>    are hardly be called(and hardly be called for trivial callback).
> 
> vfree_rcu()?
>    No, vfree() is not atomic function, will not be called in softirq.
> 

This is all rather mysterious.

> ---
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index e8b4039..04c654f 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -253,4 +253,25 @@ extern void rcu_barrier_sched(void);
>  extern void rcu_init(void);
>  extern int rcu_needs_cpu(int cpu);
>  
> +#define __KFREE_RCU_MAX_OFFSET 4095
> +#define KFREE_RCU_MAX_OFFSET (sizeof(void *) * __KFREE_RCU_MAX_OFFSET)
> +
> +#define __rcu_reclaim(head) \
> +do { \
> +	unsigned long __offset = (unsigned long)head->func; \
> +	if (__offset <= __KFREE_RCU_MAX_OFFSET) \
> +		kfree((void *)head - sizeof(void *) * __offset); \
> +	else \
> +		head->func(head); \
> +} while(0)

All the above could do with some comments explaining what it does.

Please use checkpatch.

Surely it cannot be resirable to have a "function" which can call kfree
either synchronously or asynchronously depending upon the size of the
object which was passed to it?  If the caller wants rcu treatment of
the freeing then that is what the caller must be given.  I mean, if the
caller can tolerate a synchrnous call to kfree() then the caller should
have directly called kfree?

I think (and pray) that the above could have been implemented as an
out-of-line C function?

> +
> +/**
> + * kfree_rcu - free previously allocated memory after a grace period.
> + * @ptr:  pointer returned by kmalloc.
> + * @head: structure to be used for queueing the RCU updates. This structure
> + *        is a part of previously allocated memory @ptr.
> + */
> +extern void kfree_rcu(const void *ptr, struct rcu_head *head);

rcu kerneldoc is implemented in the .c file, not in the .h file.

>  #endif /* __LINUX_RCUPDATE_H */
> diff --git a/kernel/rcuclassic.c b/kernel/rcuclassic.c
> index aad93cd..5a14190 100644
> --- a/kernel/rcuclassic.c
> +++ b/kernel/rcuclassic.c
> @@ -232,7 +232,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
>  	while (list) {
>  		next = list->next;
>  		prefetch(next);
> -		list->func(list);
> +		__rcu_reclaim(list);
>  		list = next;
>  		if (++count >= rdp->blimit)
>  			break;
> diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
> index 467d594..aa9b56a 100644
> --- a/kernel/rcupdate.c
> +++ b/kernel/rcupdate.c
> @@ -162,6 +162,18 @@ void rcu_barrier_sched(void)
>  }
>  EXPORT_SYMBOL_GPL(rcu_barrier_sched);
>  
> +void kfree_rcu(const void *ptr, struct rcu_head *head)
> +{
> +	unsigned long offset;
> +	typedef void (*rcu_callback)(struct rcu_head *);
> +
> +	offset = (void *)head - (void *)ptr;
> +	BUG_ON(offset > KFREE_RCU_MAX_OFFSET);
> +
> +	call_rcu(head, (rcu_callback)(offset / sizeof(void *)));

How can this work?  We take the difference between two pointers, divide
that by 4 or 8, then treat the resulting number as the address of an
RCU callback function.

I think I'm missing something here.

> +}
> +EXPORT_SYMBOL_GPL(kfree_rcu);
> +
>  void __init rcu_init(void)
>  {
>  	__rcu_init();
> diff --git a/kernel/rcupreempt.c b/kernel/rcupreempt.c
> index 2782793..62a9e54 100644
> --- a/kernel/rcupreempt.c
> +++ b/kernel/rcupreempt.c
> @@ -1108,7 +1108,7 @@ static void rcu_process_callbacks(struct softirq_action *unused)
>  	spin_unlock_irqrestore(&rdp->lock, flags);
>  	while (list) {
>  		next = list->next;
> -		list->func(list);
> +		__rcu_reclaim(list);
>  		list = next;
>  		RCU_TRACE_ME(rcupreempt_trace_invoke);
>  	}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ