linux-kernel - Re: [patch] epoll use a single inode ...

Message-Id: <200703071906.58405.dada1@cosmosbay.com>
Date:	Wed, 7 Mar 2007 19:06:57 +0100
From:	Eric Dumazet <dada1@...mosbay.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Davide Libenzi <davidel@...ilserver.org>,
	Avi Kivity <avi@...o.co.il>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Al Viro <viro@...iv.linux.org.uk>
Subject: Re: [patch] epoll use a single inode ...

On Wednesday 07 March 2007 18:45, Linus Torvalds wrote:
> On Wed, 7 Mar 2007, Eric Dumazet wrote:
> > sockets already uses file->private_data.
> >
> > But calls to read()/write() (not send()/recv()) still need to go through
> > the dentry, before entering socket land.
>
> Sure. The dentry and the inode need to *exist*, but they can be one single
> static dentry/inode per "file descriptor type".
>
> We always pass in the "struct file *" to read/write too, since we need it
> anyway for things like file control information (eg "is it a nonblocking
> read or write" kinds of things).
>
> So I'm not suggesting a NULL dentry/inode, I'm suggesting a single static
> one per type.
>
> And yeah, it may be harder than it looks. Some things "know" that all the
> relevant info is in the inode, so they just pass in the inode. In the pipe
> layer, for example, you'd need to change free_pipe_info() and
> alloc_pipe_info() to pass in the file descriptor instead, same goes for
> pipe_release(). But the "struct file *" is always available, it's just
> that since the code was originally written to have all the info in the
> inode, some of the code isn't set up to use it or pass it on..
>
> But your patch is independent of that, and looks fine. Except I don't like
> this part:
>
> -       file->f_path.mnt = mntget(sock_mnt);
> +       file->f_path.mnt = NULL;
>
> since I'd be much happier with always having f_path.mnt available, the same
> way we should always have f_path.dentry there.

Yes, but mntget()/mntput() are protected against NULL.
I was quite happy to remove two locked operations :)
I haven't found a way to crash my patched machine (yet) :)
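
For readers who want to see what "protected against NULL" means here, a rough userspace sketch follows. The struct and the sketch_* names are made up for illustration (they are not the kernel definitions), and C11 atomics stand in for the kernel's locked increment/decrement:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vfsmount: only the reference count. */
struct vfsmount_sketch {
    atomic_int mnt_count;
};

/* Take a reference; a NULL mount is passed through untouched. */
static struct vfsmount_sketch *sketch_mntget(struct vfsmount_sketch *mnt)
{
    if (mnt)
        atomic_fetch_add(&mnt->mnt_count, 1);  /* the "locked operation" */
    return mnt;
}

/* Drop a reference; a NULL mount is silently ignored. */
static void sketch_mntput(struct vfsmount_sketch *mnt)
{
    if (mnt) {
        int old = atomic_fetch_sub(&mnt->mnt_count, 1);
        assert(old > 0);  /* catch unbalanced puts in this sketch */
        (void)old;
    }
}
```

With f_path.mnt set to NULL, both helpers return early, which is where the two atomic operations per file would be saved.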

>
> (Btw, your patch is *not* going to work with the file->private_data
> approach, because d_path() is not passed down the "file *" thing. So we'd
> need to do that, and that's more intrusive (it can be NULL, since for
> things like cwd/pwd we don't have a "struct file").)

I tried this path today and failed...
Too many changes are needed (nameidata) to propagate a 'struct file *' 
appropriately...

>
> But I like your patch as a totally independent thing. "It just makes
> sense".
>
> (Apart from the f_path.mnt thing, which I think was something else ;)

OK, no problem. Here is the patch, without touching f_path.mnt:

(Benchmark results are not really different on my little machine: SMP kernel 
but one CPU only... maybe because the lock prefix is replaced by a nop.)


[PATCH] Delay the dentry name generation on sockets and pipes.

1) Introduce a new method in 'struct dentry_operations'. This method, 
d_dname(), may be called from d_path() to provide a dentry name for 
special filesystems. It is called without locks.

Future patches (if we succeed in having one common dentry for all pipes) may 
need to change the prototype of this method, but for now we use:
char *d_dname(struct dentry *dentry, char *buffer, int buflen)


2) Use this new method for sockets: no more sprintf() at socket creation. 
The name is generated only when someone accesses /proc/pid/fd/...

3) Use this new method for pipes: no more sprintf() at pipe creation. The 
name is generated only when someone accesses /proc/pid/fd/...
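
The three points above can be sketched in userspace C as follows. This is only an illustration of the lazy-naming idea, not the attached patch: the struct layouts are reduced stand-ins, d_ino replaces dentry->d_inode->i_ino, and pipefs_dname() and sketch_d_path() are hypothetical names. Unlike the real d_path(), the sketch writes the name at the start of the buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct dentry;

/* Reduced stand-in for struct dentry_operations: only the new hook. */
struct dentry_operations {
    /* Optional: build the dentry name on demand into the caller's buffer. */
    char *(*d_dname)(struct dentry *dentry, char *buffer, int buflen);
};

/* Reduced stand-in for struct dentry (field names are illustrative). */
struct dentry {
    const char *d_name;                   /* eagerly stored name, may be NULL */
    const struct dentry_operations *d_op; /* may be NULL */
    unsigned long d_ino;                  /* stand-in for the inode number */
};

/* Hypothetical d_dname for pipes: formats "pipe:[ino]" only when asked. */
static char *pipefs_dname(struct dentry *dentry, char *buffer, int buflen)
{
    assert(buffer != NULL && buflen > 0);
    int n = snprintf(buffer, (size_t)buflen, "pipe:[%lu]", dentry->d_ino);
    if (n < 0 || n >= buflen)
        return NULL;                      /* name did not fit */
    return buffer;
}

static const struct dentry_operations pipefs_dentry_operations = {
    .d_dname = pipefs_dname,
};

/* Simplified d_path(): prefer the lazy hook, else use the stored name. */
static char *sketch_d_path(struct dentry *dentry, char *buffer, int buflen)
{
    assert(buffer != NULL && buflen > 0);
    if (dentry->d_op && dentry->d_op->d_dname)
        return dentry->d_op->d_dname(dentry, buffer, buflen);
    if (dentry->d_name == NULL)
        return NULL;
    int n = snprintf(buffer, (size_t)buflen, "%s", dentry->d_name);
    return (n < 0 || n >= buflen) ? NULL : buffer;
}
```

Creating the dentry then costs no formatting at all; the snprintf() runs only if somebody actually asks for the path.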

A benchmark consisting of 1,000,000 calls to pipe()/close()/close() gives a 
*nice* speedup (about 10%) on my Pentium M 1.6 GHz:

3.090 s instead of 3.450 s
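
A loop of this kind might look roughly like the sketch below. It is not the program used for the numbers above; time_pipe_loop() is a hypothetical helper, and it assumes a POSIX system with clock_gettime(CLOCK_MONOTONIC):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>
#include <unistd.h>

/*
 * Run `iterations` rounds of pipe()/close()/close() and return the elapsed
 * wall-clock time in seconds, or -1.0 if a system call fails.
 */
static double time_pipe_loop(long iterations)
{
    struct timespec start, end;
    int fds[2];

    assert(iterations >= 0);
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    for (long i = 0; i < iterations; i++) {
        if (pipe(fds) != 0)
            return -1.0;
        close(fds[0]);   /* read end */
        close(fds[1]);   /* write end */
    }
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling time_pipe_loop(1000000) on an unpatched and a patched kernel would give a comparison of the same shape; absolute timings depend on the machine.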

Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
 fs/dcache.c            |    3 +++
 fs/pipe.c              |   12 +++++++++---
 include/linux/dcache.h |    1 +
 net/socket.c           |   13 ++++++++++---
 4 files changed, 23 insertions(+), 6 deletions(-)

View attachment "introduce_d_dname.patch" of type "text/plain" (2902 bytes)