[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150620002937.GE3913@linux.vnet.ibm.com>
Date: Fri, 19 Jun 2015 17:29:37 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Andi Kleen <ak@...ux.intel.com>
Cc: Dave Hansen <dave@...1.net>, dave.hansen@...ux.intel.com,
akpm@...ux-foundation.org, jack@...e.cz, viro@...iv.linux.org.uk,
eparis@...hat.com, john@...nmccutchan.com, rlove@...ve.org,
tim.c.chen@...ux.intel.com, linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched
files
On Fri, Jun 19, 2015 at 04:33:06PM -0700, Andi Kleen wrote:
> On Fri, Jun 19, 2015 at 02:50:25PM -0700, Dave Hansen wrote:
> >
> > From: Dave Hansen <dave.hansen@...ux.intel.com>
> >
> > I have a _tiny_ microbenchmark that sits in a loop and writes
> > single bytes to a file. Writing one byte to a tmpfs file is
> > around 2x slower than reading one byte from a file, which is a
> > _bit_ more than I expecte. This is a dumb benchmark, but I think
> > it's hard to deny that write() is a hot path and we should avoid
> > unnecessary overhead there.
> >
> > I did a 'perf record' of 30-second samples of read and write.
> > The top item in a diffprofile is srcu_read_lock() from
> > fsnotify(). There are active inotify fd's from systemd, but
> > nothing is actually listening to the file or its part of
> > the filesystem.
> >
> > I *think* we can avoid taking the srcu_read_lock() for the
> > common case where there are no actual marks on the file
> > being modified *or* the vfsmount.
>
> What is so expensive in it? Just the memory barrier in it?
>
> Perhaps the function can be tuned in general.
The memory barrier we are pretty much stuck with unless we want
synchronize_srcu() to be quite a bit more expensive (and for SRCU to be
unusable from offline and idle) -- and that synchronize_srcu() expense
drove rewrite from the earlier version to this one. It is possible to
cut down from two to one instances of __this_cpu_inc(), however.
It of course would be possible to have two types of SRCU, one for fast
grace periods and the other for memory-barrier-free read-side critical
sections, but obviously a very clear case would need to be made for this.
At least judging from the reactions the last time I introduced a new
flavor of RCU. ;-)
So, echoing Andi, what exactly is expensive?
Thanx, Paul
> -Andi
>
> int __srcu_read_lock(struct srcu_struct *sp)
> {
> int idx;
>
> idx = ACCESS_ONCE(sp->completed) & 0x1;
> preempt_disable();
> __this_cpu_inc(sp->per_cpu_ref->c[idx]);
> smp_mb(); /* B */ /* Avoid leaking the critical section. */
> __this_cpu_inc(sp->per_cpu_ref->seq[idx]);
> preempt_enable();
> return idx;
> }
>
>
> >
> > The *_fsnotify_mask is an aggregate of each of the masks from
> > each mark. If we have nothing set in the masks at all then there
> > are no marks and no need to do anything with 'ignored masks'
> > since none exist. This keeps us from having to do the costly
> > srcu_read_lock() for a check which is very cheap.
> >
> > This patch gave a 10.8% speedup in writes/second on my test.
> >
> > Signed-off-by: Dave Hansen <dave.hansen@...ux.intel.com>
> > Cc: Andrew Morton <akpm@...ux-foundation.org>
> > Cc: Jan Kara <jack@...e.cz>
> > Cc: Al Viro <viro@...iv.linux.org.uk>
> > Cc: Eric Paris <eparis@...hat.com>
> > Cc: John McCutchan <john@...nmccutchan.com>
> > Cc: Robert Love <rlove@...ve.org>
> > Cc: Tim Chen <tim.c.chen@...ux.intel.com>
> > Cc: Andi Kleen <ak@...ux.intel.com>
> > Cc: linux-kernel@...r.kernel.org
> > ---
> >
> > b/fs/notify/fsnotify.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff -puN fs/notify/fsnotify.c~optimize-fsnotify fs/notify/fsnotify.c
> > --- a/fs/notify/fsnotify.c~optimize-fsnotify 2015-06-19 13:29:53.117283581 -0700
> > +++ b/fs/notify/fsnotify.c 2015-06-19 13:29:53.123283853 -0700
> > @@ -213,6 +213,16 @@ int fsnotify(struct inode *to_tell, __u3
> > !(test_mask & to_tell->i_fsnotify_mask) &&
> > !(mnt && test_mask & mnt->mnt_fsnotify_mask))
> > return 0;
> > + /*
> > + * Optimization: The *_fsnotify_mask is an aggregate of each of the
> > + * masks from each mark. If we have nothing set in the masks at
> > + * all then there are no marks and no need to do anything with
> > + * 'ignored masks' since none exist. This keeps us from having to
> > + * do the costly srcu_read_lock() for a check which is very cheap.
> > + */
> > + if (!to_tell->i_fsnotify_mask &&
> > + (!mnt || !mnt->mnt_fsnotify_mask))
> > + return 0;
> >
> > idx = srcu_read_lock(&fsnotify_mark_srcu);
> >
> > _
>
> --
> ak@...ux.intel.com -- Speaking for myself only
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists