linux-kernel - Re: [PATCH 4/4] fdtable: Implement new pagesize-based fdtable allocation scheme.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <200610021052.35731.dada1@cosmosbay.com>
Date:	Mon, 2 Oct 2006 10:52:34 +0200
From:	Eric Dumazet <dada1@...mosbay.com>
To:	Vadim Lobanov <vlobanov@...akeasy.net>
Cc:	akpm@...l.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/4] fdtable: Implement new pagesize-based fdtable allocation scheme.

On Sunday 01 October 2006 23:14, Vadim Lobanov wrote:
> This patch provides an improved fdtable allocation scheme, useful for
> expanding fdtable file descriptor entries. The main focus is on the
> fdarray, as its memory usage grows 128 times faster than that of an fdset.
>
> The allocation algorithm sizes the fdarray in such a way that its memory
> usage increases in easy page-sized chunks. Additionally, it tries to
> account for the optimal usage of the allocators involved: kmalloc() for
> sizes less than a page, and vmalloc() with page granularity for sizes
> greater than a page. Namely, the following sizes for the fdarray are
> considered, and the smallest that accommodates the requested fd count is
> chosen:
>     pagesize / 4
>     pagesize / 2
>     pagesize      <- memory allocator switch point
>     pagesize * 2
>     pagesize * 3
>     pagesize * 4
>     ...etc...
> Unlike the current implementation, this allocation scheme does not require
> a loop to compute the optimal fdarray size, and can be done in straightline
> code.
>
> Furthermore, since the fdarray overflows the pagesize boundary long before
> any of the fdsets do, it makes sense to optimize run-time by allocating
> both fdsets
> in a single swoop. Even together, they will still be, by far, smaller than
> the fdarray.
>
> As long as we're replacing the guts of fs/file.c, it makes sense to tidy up
> the code. This work includes:
>     simplification via refactoring,
>     elimination of unnecessary code, and
>     extensive commenting throughout the entire file.
> This is the last patch in the series. All the code should now be sparkly
> clean.
>

Vadim, I think your patch is way too complex, and some changes are dubious.
You mix cleanups and changes in the same patch, making hard to match your 
patch description and its content.

Current scheme is to allocate power of two sizes, and not 'the smallest that 
accommodates the requested fd count'. This is for a good reason, because we 
don't want to call vmalloc()/vfree() each time a process opens 512 or 1024 
more files (x86_64 or ia32)

I  personally prefer that table grows by a two factor, especially when they 
are huge. Also, power of two sizes gives less vmalloc space fragmentation 
(might be a concern for some people that are LOWMEM tight and that reduce 
VMALLOC space to get more LOWMEM)
default __VMALLOC_RESERVE on i386 is 128Mo, but I have some servers where I 
use
vmalloc=16M just to give more LOWMEM for kernel use.

diff -Npru old/include/linux/file.h new/include/linux/file.h
--- old/include/linux/file.h    2006-09-28 20:13:13.000000000 -0700
+++ new/include/linux/file.h    2006-09-28 20:22:05.000000000 -0700
@@ -29,8 +29,8 @@ struct embedded_fd_set {
 struct fdtable {
        unsigned int max_fds;
        struct file ** fd;      /* current fd array */
-       fd_set *close_on_exec;
        fd_set *open_fds;
+       fd_set *close_on_exec;
        struct rcu_head rcu;
        struct fdtable *next;
 };

Whats the reason for moving this close_on_exec definition in struct fdtable ?

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/