[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9d4e37e2-417e-7a2f-a187-5c7b680c6777@intel.com>
Date: Tue, 2 Aug 2022 14:36:00 +0800
From: "Yin, Fengwei" <fengwei.yin@...el.com>
To: Dave Chinner <david@...morbit.com>,
kernel test robot <oliver.sang@...el.com>
CC: Dave Chinner <dchinner@...hat.com>,
"Darrick J. Wong" <djwong@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
<linux-xfs@...r.kernel.org>, <lkp@...ts.01.org>, <lkp@...el.com>
Subject: Re: [LKP] Re: [xfs] 016a23388c: stress-ng.xattr.ops_per_sec 58.4%
improvement
Hi Dave,
On 7/22/2022 6:01 AM, Dave Chinner wrote:
> A huge amount of spinlock contention in the xlog_commit_cil() path
> went away. The commit identified doesn't remove/change any
> spinlocks, it actually adds more overhead to the critical section of
> the above spinlock in preparation for removing said spinlocks.
>
> That removal happens in the next commit in that series - c0fb4765c508 ("xfs:
> convert CIL to unordered per cpu lists") - so I'd be expecting a
> bisect to demonstrate that the spinlock contention goes away with
> the commit that removed the spinlocks (as it does in all the testing
> of this I've done over the past 2 years), not the commit this bisect
> identified. Hence I think the bisect went wrong somewhere...
We did some investigation and got:
commit:
df7a4a2134b0a ("xfs: convert CIL busy extents to per-cpu")
016a23388cdcb ("xfs: Add order IDs to log items in CIL")
c0fb4765c5086 ("xfs: convert CIL to unordered per cpu lists")
df7a4a2134b0a201 016a23388cdcb2740deb1379dc4 c0fb4765c5086cfd00f1158f5f4
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
62.07 +0.0% 62.09 -0.0% 62.06 stress-ng.time.elapsed_time
62.07 +0.0% 62.09 -0.0% 62.06 stress-ng.time.elapsed_time.max
2237 +0.0% 2237 +0.0% 2237 stress-ng.time.file_system_inputs
1842 ± 4% +16.9% 2152 ± 3% +17.6% 2166 stress-ng.time.involuntary_context_switches
551.00 -0.3% 549.10 -0.3% 549.40 stress-ng.time.major_page_faults
6376 -1.1% 6305 ± 2% +0.6% 6416 stress-ng.time.maximum_resident_set_size
9704 -0.3% 9676 -0.1% 9691 stress-ng.time.minor_page_faults
4096 +0.0% 4096 +0.0% 4096 stress-ng.time.page_size
841.90 -2.4% 821.70 -2.4% 821.90 stress-ng.time.percent_of_cpu_this_job_got
512.83 -3.4% 495.24 -3.6% 494.18 stress-ng.time.system_time
10.05 ± 8% +52.3% 15.30 ± 3% +61.1% 16.19 ± 2% stress-ng.time.user_time
2325 ± 16% +66.5% 3873 ± 7% +70.3% 3962 ± 6% stress-ng.time.voluntary_context_switches
1544 ± 4% +54.4% 2385 +63.9% 2531 stress-ng.xattr.ops
Yes. commit c0fb4765c5086 ("xfs: convert CIL to unordered per cpu lists")
could bring performance gain also. But the most performance gain (54.4%)
is from commit 016a23388cdcb ("xfs: Add order IDs to log items in CIL").
Based on commit 016a23388cdcb and add following change:
diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index 6bc540898e3a..7c6c91a0a12d 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -659,9 +659,14 @@ xlog_cil_insert_items(
continue;
lip->li_order_id = order;
- if (!list_empty(&lip->li_cil))
- continue;
- list_add_tail(&lip->li_cil, &cil->xc_cil);
+
+ /*
+ * Only move the item if it isn't already at the tail. This is
+ * to prevent a transient list_empty() state when reinserting
+ * an item that is already the only item in the CIL.
+ */
+ if (!list_is_last(&lip->li_cil, &cil->xc_cil))
+ list_move_tail(&lip->li_cil, &cil->xc_cil);
}
The performance will drop to the same level as commit df7a4a2134b0a
("xfs: convert CIL busy extents to per-cpu"):
commit:
016a23388cdcb2740deb1379dc408f21c84efb11
a8bef09e7d8e65207c8403e030a0965db43ce3de
016a23388cdcb274 a8bef09e7d8e65207c8403e030a
---------------- ---------------------------
%stddev %change %stddev
\ | \
62.06 -0.0% 62.05 stress-ng.time.elapsed_time
62.06 -0.0% 62.05 stress-ng.time.elapsed_time.max
2237 +0.0% 2237 stress-ng.time.file_system_inputs
2226 -16.7% 1855 ± 4% stress-ng.time.involuntary_context_switches
549.00 +0.5% 551.67 stress-ng.time.major_page_faults
6286 +0.1% 6292 stress-ng.time.maximum_resident_set_size
9636 +0.1% 9641 stress-ng.time.minor_page_faults
4096 +0.0% 4096 stress-ng.time.page_size
823.00 +3.0% 847.33 stress-ng.time.percent_of_cpu_this_job_got
496.02 +4.2% 516.61 stress-ng.time.system_time
15.08 -38.1% 9.33 ± 4% stress-ng.time.user_time
4034 ± 3% -43.0% 2299 ± 6% stress-ng.time.voluntary_context_switches
2368 -37.4% 1482 ± 4% stress-ng.xattr.ops
Regards
Yin, Fengwei
Powered by blists - more mailing lists