Date:   Sun, 15 Nov 2020 20:40:41 +0800
From:   Feng Tang <feng.tang@...el.com>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc:     kernel test robot <oliver.sang@...el.com>,
        Lee Jones <lee.jones@...aro.org>,
        Daniel Vetter <daniel.vetter@...ll.ch>,
        Russell King <linux@...linux.org.uk>,
        Peilin Ye <yepeilin.cs@...il.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, ying.huang@...el.com, zhengjun.xing@...el.com
Subject: Re: [Fonts]  9522750c66:  fio.read_iops 7.5% improvement

On Sat, Nov 14, 2020 at 01:25:44PM +0100, Greg Kroah-Hartman wrote:
> On Sat, Nov 14, 2020 at 03:19:17PM +0800, Feng Tang wrote:
> > Hi Greg,
> > 
> > On Fri, Nov 13, 2020 at 07:46:57AM +0100, Greg Kroah-Hartman wrote:
> > > On Thu, Nov 12, 2020 at 10:06:25PM +0800, kernel test robot wrote:
> > > > 
> > > > Greeting,
> > > > 
> > > > FYI, we noticed a 7.5% improvement of fio.read_iops due to commit:
> > > > 
> > > > 
> > > > commit: 9522750c66c689b739e151fcdf895420dc81efc0 ("Fonts: Replace discarded const qualifier")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > 
> > > I strongly doubt this :)
> > 
> > We just double checked: the test was run 4 times and the results are
> > very stable.
> > 
> > The commit does look irrelevant to the fio test, but when we further
> > checked the System.map of the 2 kernels, the alignment of many data
> > symbols had changed (System.maps attached).
> > 
> > We have a hacky debug patch that forces the data section of each .o
> > file to be aligned; with it, the fio result gap drops from +7.5% to
> > +3.8%, so there is still some other factor affecting the benchmark
> > that needs more checking. And with the same method of forcing data
> > sections to be aligned, 2 other strange performance bumps [1][2]
> > reported by 0day could be recovered as well.
> > 
> > [1]. https://lore.kernel.org/lkml/20200205123216.GO12867@shao2-debian/
> > [2]. https://lore.kernel.org/lkml/20200305062138.GI5972@shao2-debian/
> 
> That's really odd.  Why wouldn't .o sections be aligned already and how
> does that affect the real .ko files that are created from that?  What
> alignment are you forcing?

Our debug patch is hacky; it enforces 16K alignment (to fit the other
rules in the linker script so the kernel can still boot), as below:

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 1bf7e31..de5ddc8 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -156,7 +156,9 @@ SECTIONS
	X86_ALIGN_RODATA_END
 
	/* Data */
-	.data : AT(ADDR(.data) - LOAD_OFFSET) {
+	.data : AT(ADDR(.data) - LOAD_OFFSET)
+	SUBALIGN(16384)
+	{
		/* Start of data section */
		_sdata = .;

And to make it boot with our kernel config, we have to disable
CONFIG_DYNAMIC_DEBUG to avoid a kernel panic.
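
For reference, here is a rough userspace sketch (not our actual tooling;
the file name "map_align.c" and the 64-byte cache line size are just
assumptions) of one way to spot such alignment shifts: it reads a
System.map and prints each data/bss symbol's offset within a cache
line, so the output for the two kernels can be diffed.

/*
 * Rough sketch (assumed helper, not our actual tooling): print the
 * offset of each data/bss symbol within an assumed 64-byte cache line,
 * so the output for two System.map files can be diffed.
 *
 * Build:  gcc -O2 -o map_align map_align.c
 * Usage:  ./map_align System.map-old > old.txt
 *         ./map_align System.map-new > new.txt
 *         diff old.txt new.txt
 */
#include <stdio.h>
#include <stdlib.h>

#define CACHELINE 64UL	/* assumption: 64-byte cache lines */

int main(int argc, char **argv)
{
	char line[512];
	FILE *fp;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <System.map>\n", argv[0]);
		return 1;
	}

	fp = fopen(argv[1], "r");
	if (!fp) {
		perror("fopen");
		return 1;
	}

	while (fgets(line, sizeof(line), fp)) {
		unsigned long addr;
		char type;
		char name[256];

		if (sscanf(line, "%lx %c %255s", &addr, &type, name) != 3)
			continue;

		/* d/D: .data symbols, b/B: .bss symbols */
		if (type != 'd' && type != 'D' && type != 'b' && type != 'B')
			continue;

		printf("%-48s line_offset=%lu\n", name, addr % CACHELINE);
	}

	fclose(fp);
	return 0;
}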

> And also, what hardware is seeing this performance gains?  Something is
> fitting into a cache now that previously wasn't, and tracking that down
> seems like it would be very worthwhile as that is a non-trivial speedup
> that some developers take years to achieve with code changes.

It's an x86 server with 2S/48C/96T (2 sockets, 48 cores, 96 threads),
and the fio parameters are:

	[global]
	bs=2M
	ioengine=mmap
	iodepth=32
	size=4473924266
	nr_files=1
	filesize=4473924266
	direct=0
	runtime=240
	invalidate=1
	fallocate=posix
	io_size=4473924266
	file_service_type=roundrobin
	random_distribution=random
	group_reporting
	pre_read=0

	time_based

	[task_0]
	rw=read
	directory=/fs/pmem0
	numjobs=24

	[task_1]
	rw=read
	directory=/fs/pmem1
	numjobs=24

And yes, we also think it's cacheline related, and we are checking it
further. Actually we have 2 other similar strange performance changes
under investigation:

https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/
https://lore.kernel.org/lkml/20201004132838.GU393@shao2-debian/
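
To illustrate the kind of effect we suspect (a userspace sketch with
made-up variables, not the actual kernel data involved here): whether
two unrelated hot variables land in the same cache line can noticeably
change throughput when they are written from different CPUs.

/*
 * Userspace sketch of false sharing (made-up variables, only for
 * illustration): two threads hammer two unrelated counters.  When the
 * counters share a 64-byte cache line, the line ping-pongs between the
 * cores; padding/aligning them apart avoids that.
 *
 * Build:  gcc -O2 -pthread -o false_share false_share.c
 *         gcc -O2 -pthread -DPADDED -o no_false_share false_share.c
 */
#include <pthread.h>
#include <stdio.h>

#define ITERS 100000000UL

struct counters {
	volatile unsigned long a;	/* shares a line with 'b' ... */
#ifdef PADDED
	char pad[64];			/* ... unless padded apart */
#endif
	volatile unsigned long b;
};

static struct counters c __attribute__((aligned(64)));

static void *bump_a(void *arg)
{
	for (unsigned long i = 0; i < ITERS; i++)
		c.a++;
	return NULL;
}

static void *bump_b(void *arg)
{
	for (unsigned long i = 0; i < ITERS; i++)
		c.b++;
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, bump_a, NULL);
	pthread_create(&t2, NULL, bump_b, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);

	printf("a=%lu b=%lu\n", c.a, c.b);
	return 0;
}

Timing the two builds (with and without -DPADDED) typically shows a
clear gap. In the kernel the equivalent layout is decided by the
linker, which is why a seemingly unrelated commit can shift it.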

So it may take some time. And to be frank, there have been quite a few
similar old cases for which we couldn't figure out the exact cause.

Thanks,
Feng

> thanks,
> 
> greg k-h
