[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <542EA009.4060009@6wind.com>
Date: Fri, 03 Oct 2014 15:09:29 +0200
From: Nicolas Dichtel <nicolas.dichtel@...nd.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
CC: netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
davem@...emloft.net, akpm@...ux-foundation.org,
adobriyan@...il.com, rui.xiang@...wei.com, viro@...iv.linux.org.uk,
oleg@...hat.com, gorcunov@...nvz.org,
kirill.shutemov@...ux.intel.com, grant.likely@...retlab.ca,
tytso@....edu, Thierry Herbelot <thierry.herbelot@...nd.com>
Subject: Re: [RFC PATCH linux 2/2] fs/proc: use a hash table for the directory
entries
Le 02/10/2014 20:01, Eric W. Biederman a écrit :
> Nicolas Dichtel <nicolas.dichtel@...nd.com> writes:
>
>> From: Thierry Herbelot <thierry.herbelot@...nd.com>
>>
>> The current implementation for the directories in /proc is using a single
>> linked list. This is slow when handling directories with large numbers of
>> entries (eg netdevice-related entries when lots of tunnels are opened).
>>
>> This patch enables multiple linked lists. A hash based on the entry name is
>> used to select the linked list for one given entry.
>>
>> The speed creation of netdevices is faster as shorter linked lists must be
>> scanned when adding a new netdevice.
>
> Is the directory of primary concern /proc/net/dev/snmp6 ?
Yes.
>
> Unless I have configured my networking stack weird by mistake that
> is the only directory under /proc/net that grows when we add an
> interface.
>
> I just want to make certain I am seeing the same things that you are
> seeing.
>
> I feel silly for overlooking this directory when the rest of the
> scalability work was done.
>
>> Here are some numbers:
>>
>> dummy30000.batch contains 30 000 times 'link add type dummy'.
>>
>> Before the patch:
>> time ip -b dummy30000.batch
>> real 2m32.221s
>> user 0m0.380s
>> sys 2m30.610s
>>
>> After the patch:
>> time ip -b dummy30000.batch
>> real 1m57.190s
>> user 0m0.350s
>> sys 1m56.120s
>>
>> The single 'subdir' list head is replaced by a subdir hash table. The subdir
>> hash buckets are only allocated for directories. The number of hash buckets
>> is a compile-time parameter.
>
> That looks like a nice speed up. A couple of things.
>
> With sysfs and sysctl when faced this class of challenge we used an
> rbtree instead of a hash table. That should use less memory and scale
> better.
>
> I am concerned about a fixed sized hash table moving the location where
> we fall off a cliff but not removing the cliff itself.
>
> I suppose it would be possible to use the new fancy resizable hash
> tables but previous work on sysctl and sysfs suggests that we don't look
> up these entries sufficiently to require a hash table. We just need a
> data structure that doesn't fall over at scale, and the rbtrees seem to
> do that very nicely.
Ok, I will have a look at it.
Thank you,
Nicolas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists