Message-ID: <CO1PR11MB4931D767C5277A37F24C824DE4489@CO1PR11MB4931.namprd11.prod.outlook.com>
Date:   Wed, 31 May 2023 10:31:09 +0000
From:   "Chen, Zhiyin" <zhiyin.chen@...el.com>
To:     Eric Biggers <ebiggers@...nel.org>,
        Christian Brauner <brauner@...nel.org>
CC:     "viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
        "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Zou, Nanhai" <nanhai.zou@...el.com>,
        "Feng, Xiaotian" <xiaotian.feng@...el.com>
Subject: RE: [PATCH] fs.h: Optimize file struct to prevent false sharing

As Eric said, CONFIG_RANDSTRUCT_NONE is set in the default config
and in some production environments, including Ali Cloud. Therefore,
it is worthwhile to optimize the file struct layout.

Here are the UnixBench syscall test results.

Command: numactl -C 3-18 ./Run -c 16 syscall

Without patch
------------------------
224 CPUs in system; running 16 parallel copies of tests
System Call Overhead                        5611223.7 lps   (10.0 s, 7 samples)
System Benchmarks Partial Index              BASELINE       RESULT    INDEX
System Call Overhead                          15000.0    5611223.7   3740.8
                                                                   ========
System Benchmarks Index Score (Partial Only)                         3740.8

With patch
------------------------
224 CPUs in system; running 16 parallel copies of tests
System Call Overhead                        7567076.6 lps   (10.0 s, 7 samples)
System Benchmarks Partial Index              BASELINE       RESULT    INDEX
System Call Overhead                          15000.0    7567076.6   5044.7
                                                                   ========
System Benchmarks Index Score (Partial Only)                         5044.7
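
For reference, the relative gain implied by the two runs above can be worked out directly from the reported lps values. The following is only a minimal sketch: the function names are hypothetical, and the index formula (result / baseline * 10) is an assumption inferred from the BASELINE, RESULT and INDEX columns above, not taken from the UnixBench sources.

```c
#include <stdio.h>

/* Assumed index formula, inferred from the tables above:
 * INDEX = RESULT / BASELINE * 10. */
double unixbench_index(double result_lps, double baseline_lps)
{
    return result_lps / baseline_lps * 10.0;
}

/* Relative gain of `after` over `before`, in percent. */
double gain_percent(double before, double after)
{
    return (after - before) / before * 100.0;
}

/* Print the index of each run and the gain between them,
 * using the figures reported in this mail. */
void print_summary(void)
{
    const double baseline = 15000.0;
    const double without_patch = 5611223.7; /* lps, without patch */
    const double with_patch = 7567076.6;    /* lps, with patch */

    printf("index without patch: %.1f\n",
           unixbench_index(without_patch, baseline));
    printf("index with patch:    %.1f\n",
           unixbench_index(with_patch, baseline));
    printf("gain:                %.1f%%\n",
           gain_percent(without_patch, with_patch));
}
```

Calling print_summary() from a main() gives a gain of a little under 35% for this 16-copy run, which falls within the 30~50% range quoted in the patch description.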

> -----Original Message-----
> From: Eric Biggers <ebiggers@...nel.org>
> Sent: Wednesday, May 31, 2023 9:56 AM
> To: Christian Brauner <brauner@...nel.org>
> Cc: Chen, Zhiyin <zhiyin.chen@...el.com>; viro@...iv.linux.org.uk; linux-
> fsdevel@...r.kernel.org; linux-kernel@...r.kernel.org; Zou, Nanhai
> <nanhai.zou@...el.com>
> Subject: Re: [PATCH] fs.h: Optimize file struct to prevent false sharing
> 
> On Tue, May 30, 2023 at 10:50:42AM +0200, Christian Brauner wrote:
> > On Mon, May 29, 2023 at 10:06:26PM -0400, chenzhiyin wrote:
> > > In the syscall test of UnixBench, performance regression occurred
> > > due to false sharing.
> > >
> > > The lock and atomic members, including file::f_lock, file::f_count
> > > and file::f_pos_lock are highly contended and frequently updated in
> > > the high-concurrency test scenarios. perf c2c identified one
> > > affected read access, file::f_op.
> > > To prevent false sharing, the layout of file struct is changed as
> > > follows:
> > > (A) f_lock, f_count and f_pos_lock are put together to share the
> > > same cache line.
> > > (B) The read mostly members, including f_path, f_inode, f_op are put
> > > into a separate cache line.
> > > (C) f_mode is put together with f_count, since they are used
> > > frequently at the same time.
> > >
> > > The optimization has been validated in the syscall test of
> > > UnixBench. The performance gain is 30~50% when the number of
> > > parallel jobs is 16.
> > >
> > > Signed-off-by: chenzhiyin <zhiyin.chen@...el.com>
> > > ---
> >
> > Sounds interesting, but can we see the actual numbers, please?
> > So struct file is marked with __randomize_layout which seems to make
> > this whole reordering pointless or at least only useful if the
> > structure randomization Kconfig is turned off. Is there any precedent
> > for optimizing structures that are marked as randomizable?
> 
> Most people don't use CONFIG_RANDSTRUCT.  So it's still worth optimizing
> struct layouts for everyone else.
> 
> - Eric
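
The grouping described in (A), (B) and (C) of the quoted patch description can be illustrated in user space. This is only a sketch and not the actual patch: struct demo_file, the stand-in member types, demo_cache_line_of() and the 64-byte cache line size are all assumptions made for illustration. The real struct file has more members, and the patch may get the same effect purely by reordering fields rather than by explicit alignment.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define DEMO_CACHE_LINE 64 /* assumption: 64-byte cache lines */

/* Hypothetical, simplified stand-in for struct file. */
struct demo_file {
    /* (A) and (C): frequently written members share one cache line. */
    alignas(DEMO_CACHE_LINE) int f_lock; /* stand-in for spinlock_t */
    long f_count;                        /* stand-in for atomic_long_t */
    unsigned int f_mode;
    long f_pos_lock[4];                  /* stand-in for struct mutex */

    /* (B): read-mostly members start on a separate cache line. */
    alignas(DEMO_CACHE_LINE) void *f_path[2]; /* stand-in for struct path */
    void *f_inode;
    const void *f_op;
};

/* Index of the cache line (within the struct) that holds `offset`. */
size_t demo_cache_line_of(size_t offset)
{
    return offset / DEMO_CACHE_LINE;
}

/* Compile-time check: readers of f_op never touch the line that
 * holds the contended f_count. */
static_assert(offsetof(struct demo_file, f_op) / DEMO_CACHE_LINE !=
              offsetof(struct demo_file, f_count) / DEMO_CACHE_LINE,
              "f_op must not share a cache line with f_count");
```

With this layout, a CPU that only reads f_op does not have its cache line invalidated each time another CPU updates f_count or takes f_lock, which is the false sharing that perf c2c reported.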
