lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Thu, 30 Oct 2008 08:51:26 -0700 From: Stephen Hemminger <shemminger@...tta.com> To: Eric Dumazet <dada1@...mosbay.com> Cc: paulmck@...ux.vnet.ibm.com, Corey Minyard <minyard@....org>, David Miller <davem@...emloft.net>, benny+usenet@...rsen.dk, netdev@...r.kernel.org, Christoph Lameter <cl@...ux-foundation.org>, a.p.zijlstra@...llo.nl, johnpol@....mipt.ru, Christian Bell <christian@...i.com> Subject: Re: [PATCH] udp: Introduce special NULL pointers for hlist termination On Thu, 30 Oct 2008 16:40:01 +0100 Eric Dumazet <dada1@...mosbay.com> wrote: > Eric Dumazet a écrit : > > Paul E. McKenney a écrit : > >> On Wed, Oct 29, 2008 at 09:00:13PM +0100, Eric Dumazet wrote: > >>> Hum... Another way of handling all those cases and avoid memory barriers > >>> would be to have different "NULL" pointers. > >>> > >>> Each hash chain should have a unique "NULL" pointer (in the case of > >>> UDP, it > >>> can be the 128 values : [ (void*)0 .. (void *)127 ] > >>> > >>> Then, when performing a lookup, a reader should check the "NULL" pointer > >>> it get at the end of its lookup has is the "hash" value of its chain. > >>> > >>> If not -> restart the loop, aka "goto begin;" :) > >>> > >>> We could avoid memory barriers then. > >>> > >>> In the two cases Corey mentioned, this trick could let us avoid > >>> memory barriers. > >>> (existing one in sk_add_node_rcu(sk, &hslot->head); should be enough) > >>> > >>> What do you think ? > >> > >> Kinky!!! ;-) > >> > >> Then the rcu_dereference() would be supplying the needed memory barriers. > >> > >> Hmmm... I guess that the only confusion would be if the element got > >> removed and then added to the same list. But then if its pointer was > >> pseudo-NULL, then that would mean that all subsequent elements had been > >> removed, and all preceding ones added after the scan started. > >> > >> Which might well be harmless, but I must defer to you on this one at > >> the moment. > >> > >> If you need a larger hash table, another approach would be to set the > >> pointer's low-order bit, allowing the upper bits to be a full-sized > >> index -- or even a pointer to the list header. Just make very sure > >> to clear the pointer when freeing, or an element on the freelist > >> could end up looking like a legitimate end of list... Which again > >> might well be safe, but why inflict this on oneself? > > > > Ok, here is an updated and tested patch. > > Thanks everybody > > [PATCH] udp: Introduce special NULL pointers for hlist termination > > In order to safely detect changes in chains, we would like to have different > 'NULL' pointers. Each chain in hash table is terminated by an unique 'NULL' > value, so that the lockless readers can detect their lookups evaded from > their starting chain. > > We introduce a new type of hlist implementation, named hlist_nulls, were > we use the least significant bit of the 'ptr' to tell if its a "NULL" value > or a pointer to an object. We expect to use this new hlist variant for TCP > as well. > > For UDP/UDP-Lite hash table, we use 128 different "NULL" values, > (UDP_HTABLE_SIZE=128) > > Using hlist_nulls saves memory barriers (a read barrier to fetch 'next' > pointers *before* checking key values) we added in commit > 96631ed16c514cf8b28fab991a076985ce378c26 > (udp: introduce sk_for_each_rcu_safenext()) > > This also saves a write memory barrier in udp_lib_get_port(), between > sk->sk_hash update and sk->next update) > > Signed-off-by: Eric Dumazet <dada1@...mosbay.com> > --- IMHO this goes over the edge into tricky hack. Is it really worth it? Is there a better simpler way? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists