Date:	Tue, 24 Feb 2015 17:50:14 +0000
From:	Thomas Graf <tgraf@...g.ch>
To:	David Miller <davem@...emloft.net>
Cc:	kaber@...sh.net, paulmck@...ux.vnet.ibm.com, josh@...htriplett.org,
	alexei.starovoitov@...il.com, herbert@...dor.apana.org.au,
	ying.xue@...driver.com, netdev@...r.kernel.org,
	netfilter-devel@...r.kernel.org
Subject: Re: Ottawa and slow hash-table resize

On 02/24/15 at 12:09pm, David Miller wrote:
> And having a flood of 1 million new TCP connections all at once
> shouldn't knock us over.
> 
> Therefore, we will need to find a way to handle this problem without
> being able to block on insert.

One possible way to handle this is to have users like TCP grow
faster than 2x. Maybe start with 16x and then decay the growth
factor towards 2x using a log function. (No, we do not want
rhashtable congestion control algos ;-)
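
Roughly something like this, completely untested and with a
made-up helper name and made-up constants, just to illustrate
the decay:

/* Decay the growth factor from 16x towards 2x as the table gets
 * bigger.  Needs <linux/log2.h> for ilog2() and <linux/kernel.h>
 * for min_t(); the divisor of 8 is arbitrary.
 */
static unsigned int rht_growth_factor(unsigned int size)
{
	unsigned int shift = 4;		/* 2^4 = 16x */

	if (size)
		shift -= min_t(unsigned int, shift - 1, ilog2(size) / 8);
	return 1U << shift;
}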

> Thinking about this, if inserts occur during a pending resize, if the
> nelems of the table has exceeded even the grow threshold for the new
> table, it makes no sense to allow these async inserts as they are
> going to make the resize take longer and prolong the pain.

Let's say we start with an initial table size of 16K (we can make
this system memory dependent) and we grow by 8x. New inserts go
into the new table immediately, so as soon as we have 12K entries
we'll grow right to 128K buckets. As we grow above 96K entries
we'll start growing to 1024K buckets. New entries already go to
the 1024K buckets at this point, given that the first grow cycle
should be fast. The 2nd grow cycle would take an estimated 6 RCU
grace periods. This would also still give us a max of 8K bucket
locks, which should be good enough as well.

Just thinking this out loud. Still working on this.
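
The schedule above spelled out in made-up defines, untested:

/* 16K initial buckets, 8x growth, expand at 75% load of the
 * table currently being filled:
 *
 *    16K buckets: grow at 12K entries ->  128K buckets
 *   128K buckets: grow at 96K entries -> 1024K buckets
 */
#define RHT_INIT_SHIFT	14		/* 16K buckets */
#define RHT_GROW_SHIFT	3		/* 8x per grow cycle */

static inline unsigned int rht_next_size(unsigned int size)
{
	return size << RHT_GROW_SHIFT;
}

static inline bool rht_needs_grow(unsigned int nelems, unsigned int size)
{
	return nelems > size / 4 * 3;	/* 75% load */
}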

> On one hand I like the async resize because it means that an insert
> that triggers the resize doesn't incur a huge latency spike since
> it was simply unlucky to be the resize trigger event.  The async
> resize smoothes out the cost of the resize across the system.
> 
> This scheme works really well if, on average, the resize operation
> completes before enough subsequent inserts occur to exceed even
> the resized tables resize threshold.
> 
> So I think what I'm getting at is that we can allow parallel inserts
> but only up until the point where the resized tables thresholds are
> exceeded.
> 
> Looking at how to implement this, I think that there is too much
> configurability to this code.  There is no reason to have indirect
> calls for the grow decision.  This should be a quick test, but it's
> not because we go through ->grow_decision.  It should just be
> rht_grow_above_75 or whatever, and inline this crap!
> 
> Nobody even uses this indirection capability, it's therefore over
> engineered :-)

Another option is to only call the grow_decision once every N
inserts or removals (32? 64?) and handle updates in batches. No
objection to ditching the grow/shrink function for now though.
Not sure anyone actually needs different growth semantics.
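
The batching could look roughly like this (untested, field names
hand-wavy):

/* Only evaluate the grow decision on every 64th insert; the
 * resize itself still runs asynchronously in the worker.
 */
#define RHT_GROW_CHECK_INTERVAL	64

static void rht_insert_done(struct rhashtable *ht,
			    struct bucket_table *tbl)
{
	unsigned int nelems = atomic_inc_return(&ht->nelems);

	if (nelems % RHT_GROW_CHECK_INTERVAL)
		return;

	if (rht_grow_above_75(ht, tbl->size))
		schedule_work(&ht->run_work);
}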