linux-kernel - Re: [RFC, PATCH] kernel/rcu: add kfree

Message-ID: <20090112174332.GA14675@linux.vnet.ibm.com>
Date:	Mon, 12 Jan 2009 09:43:32 -0800
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Manfred Spraul <manfred@...orfullife.com>
Cc:	Lai Jiangshan <laijs@...fujitsu.com>, linux-kernel@...r.kernel.org,
	akpm@...ux-foundation.org
Subject: Re: [RFC, PATCH] kernel/rcu: add kfree_rcu

On Sun, Jan 04, 2009 at 12:06:58PM -0800, Paul E. McKenney wrote:
> On Sun, Jan 04, 2009 at 01:22:00PM +0100, Manfred Spraul wrote:
> > Lai Jiangshan wrote:
> >> I have not posted it. -:)
> >>   
> > Could you post it?
> >
> > Paul: What would break if we stop processing rcu entries in (cpu) order?
> 
> If I understand, you are suggesting that a given CPU process its RCU
> callbacks out of order.  This would break rcu_barrier(), so please do
> not do this.
> 
> If I misunderstood what you are suggesting, please enlighten me!

One other thing that might be really cool is for memory freed via RCU to
be treated as if it was cache-cold, which it is unless the RCU callback
needs to write to the memory block.  In the case of kfree_rcu(), the
callback should not need to do writes, so it might make sense to handle
the block differently than the typical hot-in-cache free.

							Thanx, Paul
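
The point about kfree_rcu() callbacks not writing to the block can be
illustrated with a minimal user-space sketch. It assumes the two-field
struct rcu_head (next pointer plus function pointer) that the quoted text
below refers to via head->func(head). The names struct foo, foo_free_rcu()
and demo_free_after_grace_period() are hypothetical, free() stands in for
kfree(), and the grace period is simply assumed to have elapsed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Two-field head: list link plus callback pointer. */
struct rcu_head {
    struct rcu_head *next;
    void (*func)(struct rcu_head *head);
};

/* Hypothetical RCU-protected object with an embedded head. */
struct foo {
    int data;
    struct rcu_head rcu;
};

/* User-space equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static int freed_count;  /* bookkeeping for the demo only */

/*
 * A kfree_rcu()-style callback: it only computes the address of the
 * enclosing object and frees it. It performs no stores to the block.
 */
static void foo_free_rcu(struct rcu_head *head)
{
    assert(head != NULL);
    free(container_of(head, struct foo, rcu));
    freed_count++;
}

/* Allocate one object and invoke its callback as rcu_do_batch() would. */
int demo_free_after_grace_period(void)
{
    struct foo *p = malloc(sizeof(*p));

    if (!p)
        return -1;
    p->data = 42;
    p->rcu.next = NULL;
    p->rcu.func = foo_free_rcu;

    /* Grace period assumed to have elapsed at this point. */
    p->rcu.func(&p->rcu);
    return freed_count;
}
```

Since the callback never stores to the object, nothing in the callback
itself pulls the block back into a hot, dirty cache line, which is the
property the cache-cold suggestion depends on.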

> > The head->func(head) in rcu_do_batch() is probably a nightmare for the 
> > branch target predictor.
> >
> > What about:
> > - shrinking struct rcu_head to just a pointer (let's start with the goodie)
> > - Adding a register_rcu_callback() function.
> > It allocates the per-cpu storage for the rcu grace period lists.
> > Separate lists for each registered callback - thus no need to copy the 
> > callback target into each rcu_head structure.
> > It returns a pointer/handle to these lists.
> > - call_rcu gets that handle instead of the plain function pointer.
> > - rcu_do_batch enumerates all registered callbacks. Thus first all 
> > callback_struct->func(head) calls for the first registered callback, then 
> > the calls for the 2nd callback, etc.
> > Better for the icache, better for the branch predictor.
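
The proposal above can be sketched in user-space C roughly as follows.
This is only an illustration under stated assumptions: one CPU, no locking,
no grace-period tracking, and caller-supplied storage instead of per-cpu
allocation. The name register_rcu_callback() comes from the proposal;
call_rcu_handle(), struct rcu_callback, the note_*() callbacks and
demo_batch_order() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Single-pointer head, as in the proposal. */
struct rcu_head {
    struct rcu_head *next;
};

/* One registered callback with its own pending list (one CPU only). */
struct rcu_callback {
    void (*func)(struct rcu_head *head);
    struct rcu_head *list;         /* pending heads, oldest first */
    struct rcu_head **tail;        /* where the next head gets linked */
    struct rcu_callback *next_cb;  /* next callback in registration order */
};

static struct rcu_callback *registered;
static struct rcu_callback **registered_tail = &registered;

/* Returns the handle that call_rcu_handle() takes instead of a function. */
static struct rcu_callback *
register_rcu_callback(struct rcu_callback *cb, void (*func)(struct rcu_head *))
{
    cb->func = func;
    cb->list = NULL;
    cb->tail = &cb->list;
    cb->next_cb = NULL;
    *registered_tail = cb;
    registered_tail = &cb->next_cb;
    return cb;
}

static void call_rcu_handle(struct rcu_head *head, struct rcu_callback *cb)
{
    head->next = NULL;
    *cb->tail = head;
    cb->tail = &head->next;
}

/* Invoke all heads of the first callback, then the second, and so on. */
static int rcu_do_batch(void)
{
    int invoked = 0;

    for (struct rcu_callback *cb = registered; cb; cb = cb->next_cb) {
        struct rcu_head *head = cb->list;

        cb->list = NULL;
        cb->tail = &cb->list;
        while (head) {
            struct rcu_head *next = head->next;  /* func may free head */

            cb->func(head);
            invoked++;
            head = next;
        }
    }
    return invoked;
}

/* Demo callbacks that record which one ran. */
static char order_log[8];
static size_t order_len;

static void note(char c)
{
    if (order_len < sizeof(order_log) - 1)
        order_log[order_len++] = c;
}
static void note_foo(struct rcu_head *h) { (void)h; note('f'); }
static void note_bar(struct rcu_head *h) { (void)h; note('b'); }

/* Queue foo, bar, foo and report the order in which they are invoked. */
const char *demo_batch_order(void)
{
    static struct rcu_callback foo_cb, bar_cb;
    static struct rcu_head h1, h2, h3;
    int n;

    register_rcu_callback(&foo_cb, note_foo);
    register_rcu_callback(&bar_cb, note_bar);
    call_rcu_handle(&h1, &foo_cb);
    call_rcu_handle(&h2, &bar_cb);
    call_rcu_handle(&h3, &foo_cb);
    n = rcu_do_batch();
    assert(n == 3);
    (void)n;
    return order_log;
}
```

In the demo the heads are queued as foo, bar, foo but invoked as foo, foo,
bar. Calls are grouped by target function, and queueing order across
different callbacks is no longer preserved, which is the ordering concern
raised elsewhere in this thread.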
> 
> Hmmm...  I guess that rcu_barrier() could put a callback on each of the
> resulting per-CPU lists for each CPU.  Making rcu_barrier() more
> expensive is probably not a problem.  But there would need to be a way
> of marking rcu_barrier()'s rcu_head structures, perhaps the bottom bit
> of the pointer (shudder!).
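
The bottom-bit marking mentioned above could look something like the
following user-space sketch. It assumes a single-pointer rcu_head whose
alignment leaves bit 0 of any head address clear. All helper names
(RCU_BARRIER_TAG, rcu_set_next(), rcu_next(), rcu_is_barrier(),
demo_count_barrier_heads()) are hypothetical, and the pointer/integer
round trip relies on the usual flat-address-space behaviour of uintptr_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Single-pointer head; bit 0 of next marks this head as a barrier head. */
struct rcu_head {
    struct rcu_head *next;
};

#define RCU_BARRIER_TAG ((uintptr_t)1)

/* Link h to next, optionally marking h itself as a barrier head. */
static void rcu_set_next(struct rcu_head *h, struct rcu_head *next,
                         int is_barrier)
{
    /* Pointer alignment must leave the bottom bit free for the tag. */
    assert(((uintptr_t)next & RCU_BARRIER_TAG) == 0);
    h->next = (struct rcu_head *)((uintptr_t)next |
                                  (is_barrier ? RCU_BARRIER_TAG : 0));
}

/* Successor of h with the tag bit stripped. */
static struct rcu_head *rcu_next(const struct rcu_head *h)
{
    return (struct rcu_head *)((uintptr_t)h->next & ~RCU_BARRIER_TAG);
}

static int rcu_is_barrier(const struct rcu_head *h)
{
    return ((uintptr_t)h->next & RCU_BARRIER_TAG) != 0;
}

/* Build a -> b -> c with b marked, then count the marked heads. */
int demo_count_barrier_heads(void)
{
    struct rcu_head a, b, c;
    int total = 0, barriers = 0;

    rcu_set_next(&c, NULL, 0);
    rcu_set_next(&b, &c, 1);
    rcu_set_next(&a, &b, 0);

    for (const struct rcu_head *h = &a; h; h = rcu_next(h)) {
        total++;
        barriers += rcu_is_barrier(h);
    }
    assert(total == 3);
    (void)total;
    return barriers;
}
```

Every traversal of such a list has to go through rcu_next() rather than
reading the next field directly, which is part of what makes this kind of
marking unattractive.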
> 
> The rcu_offline code will of course need to traverse these lists in
> order to move the callbacks from an outgoing CPU.
> 
> It would also be necessary to inspect the current call_rcu() invocations
> in the kernel (not too big a job, as there are only about 100 of them).
> If there are any that rely on callbacks being invoked in order, these
> would need to be addressed if we are to do something like what you
> are suggesting.  I do not recall ever suggesting that people rely on
> such ordering, but given that people can read the code and see that
> rcu_barrier() already relies on it...
> 
> So if we do go this way, we will need to update the documentation.
> 
> The deep embedded guys would like a single-pointer rcu_head, and your
> approach seems better than the one I came up with a couple of years ago
> on page 11 of:
> 
> 	http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf
> 
> At least assuming that the problems can be resolved.
> 
> I don't see how this helps the icache at all, but could see how it might
> help branch prediction.
> 
> > Paul: Do you have a test case that is suitable for benchmarking rcu?
> > Any workloads where rcu appears significantly in oprofile?
> > And: Do you know how many rcu entries are typically alive? How much memory 
> > is used for the function pointers?
> 
> The test cases I know of are those used to validate the performance of
> various RCU patches, most of which have been quite insensitive to the
> update-side overhead.  The only workloads that I am aware of where RCU
> update-side processing shows up are those running on hundreds of CPUs
> (hence hierarchical RCU).  Some workloads have many thousands of RCU
> callbacks in flight -- I believe that Dipankar Sarma measured something
> like 1600 per grace period on a file-system benchmark some years back.
> 
> The amount of memory used for the function pointers can be large, though
> many cases now union this space with other storage (e.g., struct dentry).
> The deep embedded guys have worried about it in the past, though I have
> not heard much from them in the past few years -- something about even
> cellphones having hundreds of megabytes of DRAM, I guess.  ;-)
> 
> So, in short, I am not sure that this will be worth the increase in code
> complexity, but it does sound like an interesting possibility.
> 
> 							Thanx, Paul
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/