Message-ID: <20130405155534.GC21852@quack.suse.cz>
Date: Fri, 5 Apr 2013 17:55:34 +0200
From: Jan Kara <jack@...e.cz>
To: Ramkumar Ramachandra <artagnon@...il.com>
Cc: linux-kernel@...r.kernel.org, Junio C Hamano <gitster@...ox.com>,
Thomas Rast <trast@....ethz.ch>,
Duy Nguyễn <pclouds@...il.com>,
Jeff King <peff@...f.net>,
Karsten Blees <karsten.blees@...il.com>
Subject: Re: Beyond inotify recursive watches
Hi,
On Mon 18-03-13 16:18:11, Ramkumar Ramachandra wrote:
> We, the Git folks, were wondering how to speed things up. In an
> strace of "git status" on linux-2.6.git, we found:
>
> top syscalls sorted           top syscalls sorted
> by acc. time                  by number
> ------------------------------------------------------
> 0.401906  40950  lstat        0.401906  40950  lstat
> 0.190484   5343  getdents     0.150055   5374  open
> 0.150055   5374  open         0.190484   5343  getdents
> 0.074843   2806  close        0.074843   2806  close
> 0.003216    157  read         0.003216    157  read
>
> Most of this happens when we try to build the index, querying for
> changes in tracked files and discovering untracked files. It was
> suggested that we can use inotify to speed things up: we'll write a
> user-wide daemon (like ssh_client) that will set up watches on each
> directory of each git repository. A repository-wide daemon wouldn't
> work because /proc/sys/fs/inotify/max_user_instances reads 128 on
> typical linux-3.8 systems, and this is problematic.
>
> However, Karsten and Junio point out that our efforts might be futile
> as we are trying to do what the VFS caching already does, and doing it
> poorly. Speedups, if any, would be minor and certainly not worth the
> effort.
>
> I think inotify is a poorly suited solution for our needs, as setting
> up recursive watches is horribly inelegant. I think it's a
> well-suited solution for something like Dropbox, which just executes
> something when there's a change in a specified directory. Also, I
> suspect VFS caching works by optimizing filesystem calls for
> frequently used directory entries. A git repository is not a
> collection of frequently-used directory entries, but a frequently used
> unit. I know very little about how VFS works, but I'm wondering if we
> can make any changes in VFS to make it perform better with git
> repositories. We won't need something as fine-grained as inotify: if
> the tree hash of a directory entry changes frequently enough, optimize
> all filesystem calls to inodes in the directory recursively.
> Recursively optimizing a directory is useless in the general case, and
> I would imagine something like a new rwatch() syscall for git to
> register the repository with VFS. All system calls will then be
> magically optimized, and few changes need to be made to git. The
> added side-benefit is that all other version control systems can use
> it too.
Hmm, I have a somewhat hard time understanding what you mean by
'magically optimized syscalls'. What should happen in the VFS to speed up
your load?
What your question reminds me of is the idea of a recursive modification
time stamp on directories, that is, a time stamp that gets updated whenever
anything in the tree under the directory changes. Maintaining this on every
change would be too expensive, so there is a trick: you update the time
stamp (and continue updating recursive time stamps upwards) only if a
special flag is set on the directory, and you clear the flag at that
moment. So until someone checks the time stamp and sets the flag again, no
further updates of the recursive modification time happen.
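
To make the flag trick concrete, here is a minimal user-space sketch in
Python. It is only an illustration, not kernel code: the Dir class, the
armed field, the check() and note_change() helpers and the counter used as
a clock are all made-up names, and I assume a reader sets the flag on every
directory it is interested in.

```python
import itertools

_ticks = itertools.count(1)  # stand-in clock (assumption): one tick per change


class Dir:
    """Hypothetical in-memory directory; not an actual VFS structure."""

    def __init__(self, parent=None):
        self.parent = parent
        self.rec_mtime = 0   # recursive modification time stamp
        self.armed = False   # the "special flag" from the description


def check(d):
    """Reader side: look at the stamp and set the flag again."""
    d.armed = True
    return d.rec_mtime


def note_change(d):
    """Writer side: something directly inside d changed.

    Walk upwards while the flag is set, stamping and clearing as we go.
    Returns the number of directories that were stamped.
    """
    now = next(_ticks)
    stamped = 0
    while d is not None and d.armed:
        d.rec_mtime = now
        d.armed = False
        d = d.parent
        stamped += 1
    return stamped


root = Dir()
sub = Dir(parent=root)
check(root)
check(sub)
first = note_change(sub)    # flags set: stamps sub and root, clears both flags
second = note_change(sub)   # flags clear: the walk stops immediately
```

In this sketch the first change costs one stamp per directory up to the
root, and every later change is free until a reader calls check() again.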
This scheme works for an arbitrary number of processes interested in
recursive time stamps (the time stamps just get updated more frequently).
What is somewhat inconvenient is that it only tells you that something in
the directory or its subtree changed, so you still have to scan all the
directories on the path to the modified file. So I'm not sure how much use
this would be to you.
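
The scanning cost can be sketched as follows, again in Python and purely as
an illustration. The Node class, the dirs_to_rescan() helper and the
example tree with its stamp values are all hypothetical; the only
assumption is that each directory carries a recursive time stamp that can
be compared against the time of the last scan.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical directory carrying a recursive modification time stamp."""
    name: str
    rec_mtime: int = 0
    children: list = field(default_factory=list)


def dirs_to_rescan(d, last_scan):
    """Return the names of directories whose subtree changed after last_scan."""
    if d.rec_mtime <= last_scan:
        return []                 # nothing below d changed: skip the whole subtree
    found = [d.name]              # d itself has to be read again
    for child in d.children:
        found.extend(dirs_to_rescan(child, last_scan))
    return found


# Made-up tree: a file under drivers/net changed at time 5, the last scan was at 3.
net = Node("net", rec_mtime=5)
usb = Node("usb", rec_mtime=1)
drivers = Node("drivers", rec_mtime=5, children=[net, usb])
kernel = Node("kernel", rec_mtime=1)
root = Node("root", rec_mtime=5, children=[drivers, kernel])

changed = dirs_to_rescan(root, last_scan=3)
```

Unchanged siblings (usb, kernel) are skipped without descending into them,
but every directory between the root and the modified file still has to be
visited and read.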
Honza
--
Jan Kara <jack@...e.cz>
SUSE Labs, CR