Message-ID: <dee823ca-7100-4289-8670-95047463c09d@intel.com>
Date: Mon, 4 Mar 2024 13:35:10 +0800
From: "Yin, Fengwei" <fengwei.yin@...el.com>
To: Yujie Liu <yujie.liu@...el.com>, Jan Kara <jack@...e.cz>
CC: Oliver Sang <oliver.sang@...el.com>, <oe-lkp@...ts.linux.dev>,
<lkp@...el.com>, <linux-kernel@...r.kernel.org>, Andrew Morton
<akpm@...ux-foundation.org>, Matthew Wilcox <willy@...radead.org>, Guo Xuenan
<guoxuenan@...wei.com>, <linux-fsdevel@...r.kernel.org>,
<ying.huang@...el.com>, <feng.tang@...el.com>
Subject: Re: [linus:master] [readahead] ab4443fe3c: vm-scalability.throughput
-21.4% regression
Hi Jan,
On 3/4/2024 12:59 PM, Yujie Liu wrote:
> From the perf profile, we can see that the contention of folio lru lock
> becomes more intense. We also did a simple one-file "dd" test. Looks
> like it is more likely that low-order folios are allocated after commit
> ab4443fe3c (Fengwei will help provide the data soon). Therefore, the
> average folio size decreases while the total folio amount increases,
> which leads to touching lru lock more often.
I did the following test:
With an xfs image in tmpfs mounted at /mnt, I created a 12G sparse test
file (sparse-file) and used a single process to read it on an Ice Lake
machine with 256G of system memory. This way we can be sure we are doing
a sequential file read with no page reclaim triggered.
At the same time, I profiled the distribution of the order parameter
passed to filemap_alloc_folio() to understand how the large folio orders
for the page cache are generated.
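Roughly, the steps were as follows. The commands below are a sketch of
the setup described above, not the exact commands used: the image size,
mount options, and the bpftrace one-liner for the order histogram are my
assumptions.

```shell
# Back an xfs filesystem with tmpfs so reads never touch real storage
# (sizes here are assumptions; anything larger than the test file works).
mount -t tmpfs -o size=16G tmpfs /tmp/ramdisk
truncate -s 14G /tmp/ramdisk/xfs.img
mkfs.xfs -q /tmp/ramdisk/xfs.img
mount -o loop /tmp/ramdisk/xfs.img /mnt

# 12G sparse test file: 3145728 records of 4KiB = 12884901888 bytes.
truncate -s 12G /mnt/sparse-file

# Single sequential reader; with 256G of RAM no reclaim is triggered.
dd bs=4k if=/mnt/sparse-file of=/dev/null

# In parallel, histogram the order argument of filemap_alloc_folio().
# order is the second parameter, hence arg1 in bpftrace's kprobe naming.
bpftrace -e 'kprobe:filemap_alloc_folio { @order = lhist(arg1, 0, 8, 1); }'
```

The same histogram can also be collected with bcc tools such as argdist;
the output format in the results below resembles a bcc-style histogram.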
Here is what we got:
- Commit f0b7a0d1d46625db:
$ dd bs=4k if=/mnt/sparse-file of=/dev/null
3145728+0 records in
3145728+0 records out
12884901888 bytes (13 GB, 12 GiB) copied, 2.52208 s, 5.01 GB/s
filemap_alloc_folio
page order : count distribution
0 : 57 | |
1 : 0 | |
2 : 20 | |
3 : 2 | |
4 : 4 | |
5 : 98300 |****************************************|
- Commit ab4443fe3ca6:
$ dd bs=4k if=/mnt/sparse-file of=/dev/null
3145728+0 records in
3145728+0 records out
12884901888 bytes (13 GB, 12 GiB) copied, 2.51469 s, 5.1 GB/s
filemap_alloc_folio
page order : count distribution
0 : 21 | |
1 : 0 | |
2 : 196615 |****************************************|
3 : 98303 |******************* |
4 : 98303 |******************* |
Even though the file read throughput is almost the same, the
distribution of orders looks like a regression with ab4443fe3ca6: more
smaller-order page cache folios are generated than with the parent
commit. Thanks.
Regards
Yin, Fengwei