Message-ID: <5584B62F.5080506@sr71.net>
Date: Fri, 19 Jun 2015 17:39:11 -0700
From: Dave Hansen <dave@...1.net>
To: Andi Kleen <ak@...ux.intel.com>
CC: dave.hansen@...ux.intel.com, akpm@...ux-foundation.org,
jack@...e.cz, viro@...iv.linux.org.uk, eparis@...hat.com,
john@...nmccutchan.com, rlove@...ve.org,
tim.c.chen@...ux.intel.com, linux-kernel@...r.kernel.org,
paulmck@...ux.vnet.ibm.com
Subject: Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files
On 06/19/2015 04:33 PM, Andi Kleen wrote:
>> > I *think* we can avoid taking the srcu_read_lock() for the
>> > common case where there are no actual marks on the file
>> > being modified *or* the vfsmount.
> What is so expensive in it? Just the memory barrier in it?
The profile doesn't attribute any samples to the mfence itself, but I assume
that is where the overhead comes from. The "mov 0x8(%rdi),%rcx" is identical
before and after the barrier, yet it appears much more expensive _after_
(0.58% vs. 7.51%). That makes no sense unless the barrier is what's causing it.
Here's how the annotation mode of 'perf top' breaks it down:
> │ ffffffff810fb480 <load0>:
> │ nop
> │ mov (%rdi),%rax
> 0.58 │ push %rbp
> │ incl %gs:0x7ef0f488(%rip)
> 1.73 │ mov %rsp,%rbp
> │ and $0x1,%eax
> │ movslq %eax,%rdx
> 0.58 │ mov 0x8(%rdi),%rcx
> │ incq %gs:(%rcx,%rdx,8)
> │ mfence
> 69.94 │ add $0x2,%rdx
> 7.51 │ mov 0x8(%rdi),%rcx
> 4.05 │ incq %gs:(%rcx,%rdx,8)
> 13.87 │ decl %gs:0x7ef0f45f(%rip)
> │ pop %rbp
> 1.73 │ ← retq
>
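For anyone who wants to map the listing back to C, here is a rough userspace
sketch of what the read-side lock appears to be doing (two counter increments
with a full barrier between them), plus the kind of "nobody is watching" early
exit being proposed. Everything named model_* is hypothetical and made up for
illustration; it is not the kernel's API. The per-cpu counters are modelled as
plain fields, and the "barriers" field exists only to count fences:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical single-threaded model; not the kernel's srcu_struct. */
struct model_srcu {
	unsigned long completed; /* low bit selects the active counter pair */
	unsigned long c[2];      /* reader counts, one per index */
	unsigned long seq[2];    /* sequence counts, one per index */
	unsigned long barriers;  /* instrumentation: full fences issued */
};

/* Hypothetical stand-in for "marks on the file or on the vfsmount". */
struct model_object {
	int nr_inode_marks;
	int nr_mount_marks;
};

static int model_read_lock(struct model_srcu *sp)
{
	int idx = sp->completed & 0x1;          /* and $0x1,%eax */

	sp->c[idx]++;                           /* first incq */
	/* Full fence; on x86 typically mfence or a locked instruction. */
	atomic_thread_fence(memory_order_seq_cst);
	sp->barriers++;
	sp->seq[idx]++;                         /* second incq, 2 slots later */
	return idx;
}

static void model_read_unlock(struct model_srcu *sp, int idx)
{
	assert(idx == 0 || idx == 1);
	atomic_thread_fence(memory_order_seq_cst);
	sp->barriers++;
	sp->c[idx]--;
}

/* Returns true if the slow path (lock, walk marks, unlock) was taken. */
static bool model_notify(struct model_srcu *sp, const struct model_object *obj)
{
	/* Proposed fast path: no marks anywhere, so skip the lock entirely. */
	if (obj->nr_inode_marks == 0 && obj->nr_mount_marks == 0)
		return false;

	int idx = model_read_lock(sp);
	/* ... the mark lists would be walked here ... */
	model_read_unlock(sp, idx);
	return true;
}
```

In this model an unwatched object never issues a fence, while a watched one
pays for two per notification. Whether the unlocked emptiness check is safe
against a mark being added concurrently is exactly the part that needs review;
the sketch does not try to answer that.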