Message-ID:
<SJ0PR10MB57658B01BF03025D0ED44CE6B9A62@SJ0PR10MB5765.namprd10.prod.outlook.com>
Date: Fri, 12 Jul 2024 01:17:56 +0000
From: Wayne Gao <Wayne.Gao1@...idigm.com>
To: "linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: RE: ask help to shed light on ext4 high CPU usage in kworker when
handling bufferIO while xfs is fine
Hello dear Linux FS developers,
I have a test result I would like to ask about, to understand the root cause.
At the June 2024 Linux kernel Filesystem and Memory Management summit, there was an interesting topic discussed. It basically said that the Linux kernel is bound at about 7 GB/s when writing with buffered IO, and the root cause is that only one kworker thread handles the buffered writeback work. Since high-bandwidth Gen5 NVMe will be adopted more and more, we ran some experiments following the LWN discussion thread: https://lwn.net/Articles/976856/
Test Configuration
Kernel is Linux salab-bncbeta02 6.8.5-301.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Apr 11 20:00:10 UTC 2024 x86_64 GNU/Linux
CPU is an Intel Gen5 CPU
fio is version 3.37 or later (latest)
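For reference, here is a minimal sketch of the kind of fio sequential write job used for these tests. The file path, file size, and block size are assumptions on my side, not the original job file; direct=0 is the buffered IO case and direct=1 the non-buffered case.

  # buffered sequential write; set --direct=1 for the direct IO runs
  fio --name=seqwrite --filename=/mnt/raid0/fio.dat --rw=write \
      --bs=1M --size=64G --ioengine=psync --direct=0 \
      --numjobs=1 --end_fsync=1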
Test Result summary:
Test  File system  Buffered IO  File write BW
1     XFS          True         14.6 GB/s
2     XFS          False        14.3 GB/s
3     Ext4         True         5383 MB/s
4     Ext4         False        14.5 GB/s
We can see that with the latest Intel BNC Gen5 platform, a Gen5 NVMe raid0, and kernel 6.8, the ~7 GB/s single-kworker bound only shows up with the Ext4 file system. XFS looks fine; its kworker CPU usage is only about 20%. Please check Figure 1 and Figure 2.
Figure 1. Ext4 shows 100% CPU on flush kworker thread
Figure 2. XFS shows 20% CPU on flush kworker thread
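(One way to observe this, not necessarily how the figures above were captured, is a per-thread CPU view; the buffered-writeback worker shows up with a thread name like kworker/uNN:M+flush-MAJ:MIN, where MAJ:MIN are the backing device's numbers:)

  # single batch-mode snapshot of per-thread CPU, filtered to the flush kworker
  top -H -b -n 1 | grep -E 'kworker.*flush'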
Flame graph analysis.
Figure 3 shows that the hottest spot for XFS is iomap_do_writepage, which is built on the kernel's newer folio framework, a better multi-page memory framework. Figure 4 shows that Ext4 is truly CPU bound in the kworker, which ultimately calls into __folio_start_writeback. It is quite possible this is because XFS was the first file system to adopt iomap and folios; maybe ext4 still needs some improvement here.
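(Flame graphs like these are typically generated from perf samples. A rough sketch of the capture, assuming Brendan Gregg's FlameGraph scripts are on the PATH, not necessarily the exact commands used for the figures:)

  # sample all CPUs with call graphs for 30 seconds, then render a flame graph
  perf record -F 99 -a -g -- sleep 30
  perf script | stackcollapse-perf.pl | flamegraph.pl > writeback-flame.svg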
But the LWN article's conclusion is right: even with kernel 6.8, each file system per volume still has only one writeback kworker thread. Different file systems have different designs; some use this single kworker more efficiently and reach higher bandwidth, others do not. For high-end NVMe such as Gen5 drives in raid0 or raid5, direct IO makes more sense: you can get better bandwidth than buffered IO and also save the DRAM for other cloud-native tasks.
Figure 3. XFS hot spot on iomap_do_writepage
Figure 4. ext4 flush worker is the 100% hotspot; kswapd is relatively high too
Wayne Gao
Principal Storage Solution Architect
wechat: 13636331364
solidigm.com
CONFIDENTIALITY NOTICE: This email and any files attached may contain confidential information and may be restricted from disclosure by corporate confidentiality guidelines, or applicable state and federal law. It is intended solely for the use of the person or entity to whom the email was addressed. If you are not the intended recipient of this message, be advised that any dissemination, distribution, or use of the contents of this message is strictly prohibited. Please delete this email from your system if you are not the intended recipient.