[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160414142059.GF22053@quack2.suse.cz>
Date: Thu, 14 Apr 2016 16:20:59 +0200
From: Jan Kara <jack@...e.cz>
To: Waiman Long <Waiman.Long@....com>
Cc: Alexander Viro <viro@...iv.linux.org.uk>, Jan Kara <jack@...e.com>,
Jeff Layton <jlayton@...chiereds.net>,
"J. Bruce Fields" <bfields@...ldses.org>,
Tejun Heo <tj@...nel.org>,
Christoph Lameter <cl@...ux-foundation.org>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Andi Kleen <andi@...stfloor.org>,
Dave Chinner <dchinner@...hat.com>,
Boqun Feng <boqun.feng@...il.com>,
Scott J Norton <scott.norton@....com>,
Douglas Hatch <doug.hatch@....com>
Subject: Re: [PATCH v7 4/4] vfs: Use per-cpu list for superblock's inode list
On Tue 12-04-16 18:54:46, Waiman Long wrote:
> When many threads are trying to add or delete inode to or from
> a superblock's s_inodes list, spinlock contention on the list can
> become a performance bottleneck.
>
> This patch changes the s_inodes field to become a per-cpu list with
> per-cpu spinlocks. As a result, the following superblock inode list
> (sb->s_inodes) iteration functions in vfs are also being modified:
>
> 1. iterate_bdevs()
> 2. drop_pagecache_sb()
> 3. wait_sb_inodes()
> 4. evict_inodes()
> 5. invalidate_inodes()
> 6. fsnotify_unmount_inodes()
> 7. add_dquot_ref()
> 8. remove_dquot_ref()
>
> With an exit microbenchmark that creates a large number of threads,
> attachs many inodes to them and then exits. The runtimes of that
> microbenchmark with 1000 threads before and after the patch on a
> 4-socket Intel E7-4820 v3 system (40 cores, 80 threads) were as
> follows:
>
> Kernel Elapsed Time System Time
> ------ ------------ -----------
> Vanilla 4.5-rc4 65.29s 82m14s
> Patched 4.5-rc4 22.81s 23m03s
>
> Before the patch, spinlock contention at the inode_sb_list_add()
> function at the startup phase and the inode_sb_list_del() function at
> the exit phase were about 79% and 93% of total CPU time respectively
> (as measured by perf). After the patch, the percpu_list_add()
> function consumed only about 0.04% of CPU time at startup phase. The
> percpu_list_del() function consumed about 0.4% of CPU time at exit
> phase. There were still some spinlock contention, but they happened
> elsewhere.
>
> Signed-off-by: Waiman Long <Waiman.Long@....com>
The patch looks good to me. You can add:
Reviewed-by: Jan Kara <jack@...e.cz>
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists