Message-ID: <20150119090121.GG9719@linux.vnet.ibm.com>
Date: Mon, 19 Jan 2015 01:01:21 -0800
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Pablo Neira Ayuso <pablo@...filter.org>
Cc: Patrick McHardy <kaber@...sh.net>, Thomas Graf <tgraf@...g.ch>,
David Laight <David.Laight@...LAB.COM>,
"davem@...emloft.net" <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"herbert@...dor.apana.org.au" <herbert@...dor.apana.org.au>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"john.r.fastabend@...el.com" <john.r.fastabend@...el.com>,
"josh@...htriplett.org" <josh@...htriplett.org>,
"netfilter-devel@...r.kernel.org" <netfilter-devel@...r.kernel.org>
Subject: Re: [PATCH 7/9] rhashtable: Per bucket locks & deferred
expansion/shrinking

On Fri, Jan 16, 2015 at 09:46:44PM +0100, Pablo Neira Ayuso wrote:
> On Fri, Jan 16, 2015 at 07:35:57PM +0000, Patrick McHardy wrote:
> > On 16.01, Thomas Graf wrote:
> > > On 01/16/15 at 06:36pm, Patrick McHardy wrote:
> > > > On 16.01, Thomas Graf wrote:
> > > > > On 01/16/15 at 04:43pm, David Laight wrote:
> > > > > > The walker is unlikely to see items that get inserted early in the hash
> > > > > > table even without a resize.
> > > > >
> > > > > I don't follow, you have to explain this statement.
> > > > >
> > > > > Walkers which don't want to see duplicates or miss entries should
> > > > > just take the mutex.
> > > >
> > > > Well, we do have a problem with interrupted dumps. As you know once
> > > > the netlink message buffer is full, we return to userspace and
> > > > continue dumping during the next read. Expanding obviously changes
> > > > the order since we rehash from bucket N to N and 2N, so this will
> > > > indeed cause duplicates (which don't matter) and missed entries.
> > >
> > > Right, but that's a Netlink dump issue and not specific to rhashtable.
> >
> > Well, rhashtable (or generally resizing) will make it a lot worse.
> > Usually we at worst miss entries which were added during the dump,
> > which is made up by the notifications.
> >
> > With resizing we might miss anything; it's completely nondeterministic.
> >
> > > Putting the sequence number check in place should be sufficient
> > > for sets, right?
> >
> > I don't see how. The problem is that the ordering of the hash changes
> > and it will skip different entries than those that have already been
> > dumped.
>
> I think the generation counter should catch this sort of problem.
> Resizing is triggered by adding or deleting an element, which bumps
> the counter once the transaction is handled.

One unconventional way of handling this is to associate the scan with
a one-to-one resize operation. This can be implemented to have the
effect of taking a snapshot of the table.
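
As a rough illustration, here is a toy userspace sketch of the idea. It is
not rhashtable code: every name in it (table_rehash, dump_range,
scan_with_resize) is made up for the example, the "snapshot" is simply a
same-size rehash into a private copy, and there is no locking or RCU.
It assumes an interrupted dump resumes from a saved bucket index.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NKEYS 8

/* Toy chained hash table; size must be a power of two. Allocation
 * failures and freeing are ignored to keep the sketch short. */
struct node { unsigned key; struct node *next; };
struct table { unsigned size; struct node **buckets; };

static struct table *table_new(unsigned size)
{
	struct table *t = malloc(sizeof(*t));

	t->size = size;
	t->buckets = calloc(size, sizeof(*t->buckets));
	return t;
}

static void table_insert(struct table *t, unsigned key)
{
	struct node *n = malloc(sizeof(*n));
	unsigned b = key & (t->size - 1);

	n->key = key;
	n->next = t->buckets[b];
	t->buckets[b] = n;
}

/* Rehash all of src into a fresh table. With new_size == src->size this
 * is the "one-to-one resize": same layout, but a private copy. */
static struct table *table_rehash(const struct table *src, unsigned new_size)
{
	struct table *dst = table_new(new_size);

	for (unsigned b = 0; b < src->size; b++)
		for (struct node *n = src->buckets[b]; n; n = n->next)
			table_insert(dst, n->key);
	return dst;
}

/* Dump buckets [from, to) of t, counting how often each key is reported. */
static void dump_range(const struct table *t, unsigned from, unsigned to,
		       unsigned seen[NKEYS])
{
	for (unsigned b = from; b < to && b < t->size; b++)
		for (struct node *n = t->buckets[b]; n; n = n->next)
			seen[n->key]++;
}

/* Dump in two "reads" with a resize of the live table in between. */
void scan_with_resize(unsigned old_size, unsigned new_size,
		      int use_snapshot, unsigned seen[NKEYS])
{
	struct table *live = table_new(old_size);
	const struct table *scan;
	unsigned pos = old_size / 2;	/* saved cursor: next bucket to dump */

	memset(seen, 0, NKEYS * sizeof(seen[0]));
	for (unsigned k = 0; k < NKEYS; k++)
		table_insert(live, k);

	/* The scan either owns a same-size copy or walks the live table. */
	scan = use_snapshot ? table_rehash(live, live->size) : live;

	dump_range(scan, 0, pos, seen);		/* first read */
	live = table_rehash(live, new_size);	/* resize between reads */
	if (!use_snapshot)
		scan = live;	/* cursor now indexes a different layout */
	dump_range(scan, pos, scan->size, seen);	/* second read */
}
```

Under these assumptions, growing from 4 to 8 buckets without the snapshot
reports keys 4 and 5 twice, and shrinking from 8 to 4 never reports keys
4 through 7. With the snapshot, each key is reported exactly once in both
cases, because the saved bucket index always refers to the layout the scan
started with.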

							Thanx, Paul