Message-ID: <20201115124041.GA3793@shbuild999.sh.intel.com>
Date: Sun, 15 Nov 2020 20:40:41 +0800
From: Feng Tang <feng.tang@...el.com>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: kernel test robot <oliver.sang@...el.com>,
Lee Jones <lee.jones@...aro.org>,
Daniel Vetter <daniel.vetter@...ll.ch>,
Russell King <linux@...linux.org.uk>,
Peilin Ye <yepeilin.cs@...il.com>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, zhengjun.xing@...el.com
Subject: Re: [Fonts] 9522750c66: fio.read_iops 7.5% improvement
On Sat, Nov 14, 2020 at 01:25:44PM +0100, Greg Kroah-Hartman wrote:
> On Sat, Nov 14, 2020 at 03:19:17PM +0800, Feng Tang wrote:
> > Hi Greg,
> >
> > On Fri, Nov 13, 2020 at 07:46:57AM +0100, Greg Kroah-Hartman wrote:
> > > On Thu, Nov 12, 2020 at 10:06:25PM +0800, kernel test robot wrote:
> > > >
> > > > Greeting,
> > > >
> > > > FYI, we noticed a 7.5% improvement of fio.read_iops due to commit:
> > > >
> > > >
> > > > commit: 9522750c66c689b739e151fcdf895420dc81efc0 ("Fonts: Replace discarded const qualifier")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > >
> > > I strongly doubt this :)
> >
> > We just double checked: the test was run 4 times and the results are
> > very stable.
> >
> > The commit does look irrelevant to the fio test. We further checked
> > the System.map of the 2 kernels, and the alignment of many data
> > symbols has changed (System.map files attached).
> >
> > We have a hacky debug patch that forces the data sections of each .o
> > file to be aligned. With it, the fio result gap is reduced from +7.5%
> > to +3.8%, so some other factor is still affecting the benchmark and
> > needs more checking. With the same debug method of forcing data
> > sections to be aligned, 2 other strange performance bumps[1][2]
> > reported by 0day could be recovered.
> >
> > [1]. https://lore.kernel.org/lkml/20200205123216.GO12867@shao2-debian/
> > [2]. https://lore.kernel.org/lkml/20200305062138.GI5972@shao2-debian/
>
> That's really odd. Why wouldn't .o sections be aligned already and how
> does that affect the real .ko files that are created from that? What
> alignment are you forcing?

Our debug patch is a hack that forces 16KB alignment (chosen to fit the
other rules in the linker script so that the kernel still boots), as below:

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 1bf7e31..de5ddc8 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -156,7 +156,9 @@ SECTIONS
 	X86_ALIGN_RODATA_END
 
 	/* Data */
-	.data : AT(ADDR(.data) - LOAD_OFFSET) {
+	.data : AT(ADDR(.data) - LOAD_OFFSET)
+	SUBALIGN(16384)
+	{
 		/* Start of data section */
 		_sdata = .;
 

To make it boot with our kernel config, we also have to disable
CONFIG_DYNAMIC_DEBUG to avoid a kernel panic.

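To illustrate why forcing sub-alignment can hide this kind of layout
sensitivity, here is a minimal sketch in Python. It is not the linker's
actual algorithm: it only models input sections being placed back to back,
and the object names, sizes, base address, and the 64-byte cache line are
made-up assumptions for illustration.

```python
CACHELINE = 64  # assumed cache line size in bytes


def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def layout(sections, base, subalign=None):
    """Place input sections back to back inside one output section.

    sections: list of (name, size, natural_alignment) tuples (hypothetical).
    subalign: if set, it replaces every input section's own alignment,
              which is the documented effect of SUBALIGN() in GNU ld.
    Returns a dict mapping name -> start address.
    """
    addr = base
    placed = {}
    for name, size, natural in sections:
        addr = align_up(addr, subalign or natural)
        placed[name] = addr
        addr += size
    return placed


# Hypothetical builds: only the size of a.o's data differs between them.
before = [("a.o", 100, 8), ("b.o", 200, 8), ("c.o", 40, 8)]
after = [("a.o", 132, 8), ("b.o", 200, 8), ("c.o", 40, 8)]

for subalign in (None, 16384):
    old = layout(before, 0x100000, subalign)
    new = layout(after, 0x100000, subalign)
    moved = [n for n in old if old[n] % CACHELINE != new[n] % CACHELINE]
    print(f"SUBALIGN={subalign}: cacheline offset changed for {moved}")
```

Without sub-alignment, a size change in one object shifts the cacheline
offset of everything placed after it. With every input section starting on
a 16KB boundary, the later objects keep their offsets in this model.
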
> And also, what hardware is seeing these performance gains? Something is
> fitting into a cache now that previously wasn't, and tracking that down
> seems like it would be very worthwhile as that is a non-trivial speedup
> that some developers take years to achieve with code changes.

It's an x86 server with 2S/48C/96T (2 sockets, 48 cores, 96 threads),
and the fio parameters are:

[global]
bs=2M
ioengine=mmap
iodepth=32
size=4473924266
nr_files=1
filesize=4473924266
direct=0
runtime=240
invalidate=1
fallocate=posix
io_size=4473924266
file_service_type=roundrobin
random_distribution=random
group_reporting
pre_read=0
time_based

[task_0]
rw=read
directory=/fs/pmem0
numjobs=24

[task_1]
rw=read
directory=/fs/pmem1
numjobs=24

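As a quick sanity check of the job file, the sketch below reads it with
Python's configparser and sums up the job counts. Treating [global] as the
default section is only a convenient approximation of how fio inherits
global options, and reading `size` as a per-job value is an assumption
based on fio's documented behaviour.

```python
import configparser

# The job file from this mail, copied verbatim.
JOB = """\
[global]
bs=2M
ioengine=mmap
iodepth=32
size=4473924266
nr_files=1
filesize=4473924266
direct=0
runtime=240
invalidate=1
fallocate=posix
io_size=4473924266
file_service_type=roundrobin
random_distribution=random
group_reporting
pre_read=0
time_based

[task_0]
rw=read
directory=/fs/pmem0
numjobs=24

[task_1]
rw=read
directory=/fs/pmem1
numjobs=24
"""

# allow_no_value accepts bare flags such as group_reporting; using
# "global" as the default section lets each task see the global options.
parser = configparser.ConfigParser(
    allow_no_value=True, default_section="global", interpolation=None
)
parser.read_string(JOB)

total_jobs = sum(parser.getint(s, "numjobs") for s in parser.sections())
print("total jobs:", total_jobs)

for section in parser.sections():
    jobs = parser.getint(section, "numjobs")
    size = parser.getint(section, "size")  # inherited from [global]
    print(section, parser[section]["directory"], "bytes:", jobs * size)
```

The two tasks add up to 48 jobs, which matches the 48 cores of the test
machine.
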

And yes, we also think it's cacheline related, and we are checking it
further. We are actually investigating 2 other similarly strange
performance changes:
https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/
https://lore.kernel.org/lkml/20201004132838.GU393@shao2-debian/
So it may take some time. And to be frank, there have been quite a few
similar older cases where we couldn't figure out the exact cause.

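For anyone who wants to compare two System.map files in this way, here is
a rough sketch of one possible approach, not the tooling used for this
report. It assumes 64-byte cache lines and the usual "address type name"
line format. The symbol names and addresses in the sample are invented.

```python
CACHELINE = 64  # assumed cache line size in bytes
DATA_TYPES = set("dDbB")  # nm type letters for data and bss symbols


def parse_system_map(text):
    """Map symbol name -> (address, type) from System.map text.

    Duplicate names (e.g. static symbols) overwrite each other here,
    which is a simplification of this sketch.
    """
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        addr, kind, name = parts
        symbols[name] = (int(addr, 16), kind)
    return symbols


def cacheline_shifts(old_text, new_text):
    """List data symbols whose offset within a cache line differs."""
    old = parse_system_map(old_text)
    new = parse_system_map(new_text)
    shifted = []
    for name, (old_addr, kind) in old.items():
        if kind not in DATA_TYPES or name not in new:
            continue
        old_off = old_addr % CACHELINE
        new_off = new[name][0] % CACHELINE
        if old_off != new_off:
            shifted.append((name, old_off, new_off))
    return shifted


# Invented sample input; real use would read the two System.map files.
OLD = """\
ffffffff81000000 T some_func
ffffffff82a00000 D sym_a
ffffffff82a00040 D sym_b
"""
NEW = """\
ffffffff81000010 T some_func
ffffffff82a00000 D sym_a
ffffffff82a00060 D sym_b
"""

for name, old_off, new_off in cacheline_shifts(OLD, NEW):
    print(f"{name}: cacheline offset {old_off} -> {new_off}")
```

A symbol that moves by a multiple of 64 bytes is not reported, since its
position relative to cache line boundaries is unchanged.
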
Thanks,
Feng
> thanks,
>
> greg k-h