Date:	6 Feb 2007 14:09:11 -0500
Subject: Re: [RFC/TOY]Extensible hashing and RCU

> For purposes of discussion, I've attached a "toy" implementation
> for doing dynamic resizing of a hashtable. It is useless, except
> as a proof of concept.
> I think this is very similar to what you are describing, no?

Er... not quite; that has a lot more overhead than what I was
thinking about.

I have used the trick of distinguishable sentinel values in a
doubly-linked list to maintain read cursors while it's being updated,
but I don't think that's necessary here.

(You can also encode the nh_type flag in the lsbit of the pointer
if you're being sneaky.  That will attract curses from the
memleak detector, though.)

In particular, I was imagining a singly linked list.  To delete
an item, use the hash to find the head pointer and walk it
to find the pointer to be fixed up.  Since the chains are short
anyway, this is entirely reasonable.

Less fundamental comments include:

1) Is the seqlock in get_nh() and nh_replace() really required?
   Is there any architecture that doesn't have atomic pointer stores?
   If you wanted to store the table size in a fixed location as
   well, I could see the need...

2) I think the whole __nh_sort_chain business will royally confuse
   anyone walking the chain while it happens.  This is exactly
   what I was working to avoid.  The partial sorting in __nh_insert
   isn't good enough.

   Instead, try:

/* Return true if bitrev(x) > bitrev(y) */
static bool bitrev_gt(unsigned long x, unsinged long y)
{
	unsigned long mask = x ^ y;	/* Find the bits that differ */
	mask ^= mask-1;	/* Find lsbit of difference (and all lower bits) */
	return (x & mask) > (y & mask);
}

static void __nh_insert(struct nh_entry *entry, struct nh_head *head)
{
	struct list_head *p, *n;
	unsigned long const hashval = nh_hashval(entry->data);

	/*
	 * Insert the new entry just before the first element of the list
	 * whose hash value (bit-reversed) is not less than its own.
	 */
	p = &head->list;
	list_for_each_rcu(n, &head->list) {
		struct nh_entry *t = container_of(n, struct nh_entry, nh_list);
		if (t->nh_type == NH_ENTRY &&
		    !bitrev_gt(hashval, nh_hashval(t->data)))
			break;
		p = n;
	}
	__list_add_rcu(&entry->nh_list, p, n);
}

static int nh_add(unsigned long data)
{
	struct nh_entry *entry = kmalloc(sizeof *entry, GFP_KERNEL);
	struct nh *nh;

	if (!entry)
		return -ENOMEM;

	entry->nh_type = NH_ENTRY;
	entry->data = data;

	nh = get_nh();	/* or nh = __nh */

	if (nh) {
		struct nh_head *h =
			&nh->hash[nh_bucket(nh_hashval(data), nh->nentries)];

		__nh_insert(entry, h);
	}
	return 0;
}

   Then there's no need for __nh_sort_chain at all.  Alternatively, if the
   upper bits of nh_hashval are as good as the lower bits, just index the
   hash table on them.

3) Code inside a mutex like nh_resize() can use plain list_for_each();
   the _rcu variant is only required if there can be simultaneous
   readers.
That's a nice module framework.  I'll see if I can write some code
of the sort I was thinking about.

FWIW, I figured out a way around the need to delay deletion for two
RCU intervals.

Suppose that an expansion is pending, and we have just stretched the
table from 16 entries to 32, and the following hash values are
stored.  (Note the bit-reversed order.)

old[3] --\
new[3] ---+-> 0x03 -> 0x43 -> 0x23 -> 0x63 -> 0x13 -> 0x53 -> 0x33 -> 0x73
new[19] --------------------------------------^

After an RCU period, you can throw away the old[] array and NULL-terminate
the new[i] list after 0x63.  But until then, you need to leave the list alone
to accommodate people who are looking for 0x53 via the old head.

The tricky case comes when you delete 0x13.  If you only delete it from the
new[3+16] list, you can't discard it until the RCU quiescent period
after the one which disconnects the 0x63->0x13 link.

The solution is actually very simple: notice when you're
- Deleting the first entry in a list
- While an expansion is pending
- And the list is in the second half of the expanded table
then unlink the entry from BOTH the new head and the old list.
It's a bit more work, and requires some lock-ordering care, but
it lets you queue the node for RCU cleanup the normal way.