Date:	Fri, 25 Jul 2014 23:27:19 -0600
From:	Andreas Dilger <adilger@...ger.ca>
To:	Abhi Das <adas@...hat.com>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	"cluster-devel@...hat.com" <cluster-devel@...hat.com>
Subject: Re: [RFC PATCH 0/2] dirreadahead system call

Is there ever a case where this isn't called to prefetch entries in
readdir() order?  It isn't clear to me what the benefit is of returning
the entries to userspace instead of just doing the statahead implicitly
in the kernel.

The Lustre client has had what we call "statahead" for a while.
Similar to regular file readahead, it detects the sequential pattern
of readdir() + stat() calls in readdir() order (taking into account
whether ".*" entries are being processed or not) and starts fetching
the inode attributes asynchronously with a worker thread.

This syscall might be more useful if userspace called readdir() to get
the dirents and then passed the kernel the list of inode numbers to
prefetch before starting on the stat() calls.  That way, userspace
could generate an arbitrary list of inodes (e.g. names matching a
regexp) and the kernel wouldn't need to guess whether every inode is
needed.
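
A strawman of that alternative (the name and signature here are
purely hypothetical, not something in the posted patch):

#include <dirent.h>
#include <regex.h>
#include <stdint.h>

/* Hypothetical interface, for illustration only: prefetch an
 * arbitrary, userspace-chosen set of inodes on the filesystem
 * that 'fd' lives on. */
extern long inode_prefetch(int fd, const uint64_t *inos,
                           unsigned int count);

/* Userspace does the readdir() and the filtering itself (here,
 * keeping only the names that match a regexp) and hands the kernel
 * exactly the inodes it is about to stat(), so nothing is guessed. */
static void prefetch_matching(DIR *dir, const regex_t *re)
{
        uint64_t wanted[1024];
        unsigned int n = 0;
        struct dirent *de;

        while ((de = readdir(dir)) != NULL && n < 1024) {
                if (regexec(re, de->d_name, 0, NULL, 0) == 0)
                        wanted[n++] = de->d_ino;
        }
        if (n > 0)
                inode_prefetch(dirfd(dir), wanted, n);
}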

As it stands, this syscall doesn't help with anything other than
readdir order (or a directory small enough to be handled in one
syscall), which the kernel could already handle internally, and it may
fetch a considerable number of extra inodes from disk if not every
inode needs to be touched.
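
For reference, going by the description quoted below, the intended
usage appears to be a loop like this (the wrapper and syscall number
are hypothetical; the real number comes from the syscall table
entries in the patch):

#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical userspace wrapper; __NR_dirreadahead stands in for
 * whatever number the patch assigns in the x86 syscall tables. */
static long dirreadahead(int fd, loff_t *offset, unsigned int count)
{
        return syscall(__NR_dirreadahead, fd, offset, count);
}

/* Walk the directory in readdir() order, requesting readahead of up
 * to 128 entries per call; the kernel advances '*offset', a return
 * of 0 means no entries are left, and a negative value is an error. */
static void readahead_all(int fd)
{
        loff_t off = 0;

        while (dirreadahead(fd, &off, 128) > 0)
                ;
}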

Cheers, Andreas

> On Jul 25, 2014, at 11:37, Abhi Das <adas@...hat.com> wrote:
> 
> This system call takes 3 arguments:
> fd      - file descriptor of the directory being read ahead
> *offset - offset in dir from which to resume. This is updated
>          as we move along in the directory
> count   - The max number of entries to readahead
> 
> The syscall is supposed to read up to 'count' entries starting at
> '*offset' and cache the inodes corresponding to those entries. It
> returns a negative error code or a positive number indicating
> the number of inodes it has issued readaheads for. It also
> updates the '*offset' value so that repeated calls to dirreadahead
> can resume at the right location. Returns 0 when there are no more
> entries left.
> 
> Abhi Das (2):
>  fs: Add dirreadahead syscall and VFS hooks
>  gfs2: GFS2's implementation of the dir_readahead file operation
> 
> arch/x86/syscalls/syscall_32.tbl |   1 +
> arch/x86/syscalls/syscall_64.tbl |   1 +
> fs/gfs2/Makefile                 |   3 +-
> fs/gfs2/dir.c                    |  49 ++++++---
> fs/gfs2/dir.h                    |  15 +++
> fs/gfs2/dir_readahead.c          | 209 +++++++++++++++++++++++++++++++++++++++
> fs/gfs2/file.c                   |   2 +
> fs/gfs2/main.c                   |  10 +-
> fs/gfs2/super.c                  |   1 +
> fs/readdir.c                     |  49 +++++++++
> include/linux/fs.h               |   3 +
> 11 files changed, 328 insertions(+), 15 deletions(-)
> create mode 100644 fs/gfs2/dir_readahead.c
> 
> -- 
> 1.8.1.4