Date: Mon, 23 Feb 2015 15:06:19 -0800
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: David Miller <davem@...emloft.net>
Cc: tgraf@...g.ch, josh@...htriplett.org, alexei.starovoitov@...il.com,
	herbert@...dor.apana.org.au, kaber@...sh.net, ying.xue@...driver.com,
	netdev@...r.kernel.org, netfilter-devel@...r.kernel.org
Subject: Re: Ottawa and slow hash-table resize

On Mon, Feb 23, 2015 at 05:32:52PM -0500, David Miller wrote:
> From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
> Date: Mon, 23 Feb 2015 13:52:49 -0800
>
> > On Mon, Feb 23, 2015 at 09:03:58PM +0000, Thomas Graf wrote:
> >> On 02/23/15 at 11:12am, josh@...htriplett.org wrote:
> >> > In theory, resizes should only take the locks for the buckets they're
> >> > currently unzipping, and adds should take those same locks. Neither one
> >> > should take a whole-table lock, other than resize excluding concurrent
> >> > resizes. Is that still insufficient?
> >>
> >> Correct, this is what happens. The problem is basically that
> >> if we insert from atomic context we cannot slow down inserts
> >> and the table may not grow quickly enough.
> >>
> >> > Yeah, the add/remove statistics used for tracking would need some
> >> > special handling to avoid being a table-wide bottleneck.
> >>
> >> Daniel is working on a patch to do per-cpu element counting
> >> with a batched update cycle.
> >
> > One approach is simply to count only when a resize operation is in
> > flight. Another is to keep a per-bucket count, which can be summed
> > at the beginning of the next resize operation.
>
> I think we should think exactly about what we should do when someone
> loops non-stop adding 1 million entries to the hash table and the
> initial table size is very small.
>
> This is a common use case for at least one of the current rhashtable
> users (nft_hash). When you load an nftables rule with a large set
> of IP addresses attached, this is what happens.
> Yes I understand that nftables could give a hint and start with a
> larger hash size from the start when it knows this is going to happen,
> but I still believe that we should behave reasonably when starting
> from a small table.
>
> I'd say that with the way things work right now, in this situation it
> actually hurts to allow asynchronous inserts during a resize. Because
> we end up with extremely long hash table chains, and thus make the
> resize work and the lookups both take an excruciatingly long amount of
> time to complete.
>
> I just did a quick scan of all code paths that do inserts into an
> rhashtable, and it seems like all of them can easily block. So why
> don't we do that? Make inserts sleep on an rhashtable expansion
> waitq.
>
> There could even be a counter of pending inserts, so the expander can
> decide to expand further before waking the inserting threads up.

Should be reasonably simple, and certainly seems worth a try!

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html