Message-ID: <20250516101054.676046-1-p.raghav@samsung.com>
Date: Fri, 16 May 2025 12:10:51 +0200
From: Pankaj Raghav <p.raghav@...sung.com>
To: "Darrick J . Wong" <djwong@...nel.org>,
hch@....de,
willy@...radead.org
Cc: linux-kernel@...r.kernel.org,
linux-mm@...ck.org,
David Hildenbrand <david@...hat.com>,
linux-fsdevel@...r.kernel.org,
mcgrof@...nel.org,
gost.dev@...sung.com,
Andrew Morton <akpm@...ux-foundation.org>,
kernel@...kajraghav.com,
Pankaj Raghav <p.raghav@...sung.com>
Subject: [RFC 0/3] add large zero page for zeroing out larger segments
Introduce LARGE_ZERO_PAGE of size 2M as an alternative to ZERO_PAGE.
Similar to ZERO_PAGE, LARGE_ZERO_PAGE is also a global shared page.
2M seems to be a decent compromise between memory usage and performance.
This idea (but not the implementation) was suggested during the review of
adding LBS support to XFS[1][2].
NOTE:
===
This implementation probably has a lot of holes, and it is not complete.
For example, this implementation only works on x86.
The intent of the RFC is:
- To understand whether this is something we still need in the kernel.
- To decide whether this is the approach we want to take to implement a
feature like this, or whether we should explore other alternatives.
I have excluded a lot of maintainers and mailing lists and only included
the relevant folks in this RFC, to understand the direction we want to
take if this feature is needed.
===
There are many places in the kernel where we need to zero out larger
chunks, but the maximum segment we can zero out at a time is limited by
PAGE_SIZE.
This is especially annoying in block devices and filesystems where we
attach multiple ZERO_PAGEs to the bio in different bvecs. With multipage
bvec support in the block layer, it is much more efficient to send out
a larger zero page as part of a single bvec.
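To make the segment-count argument concrete, here is a minimal userspace
sketch (not kernel code, and not taken from the patches). It only does the
arithmetic: how many zero-filled segments are needed to cover a range when
each segment can reference at most one zero page. The names
SKETCH_PAGE_SIZE, SKETCH_LARGE_ZERO_SIZE and zero_segments_needed() are
hypothetical, and the 4K base page size is an assumption (x86-64).

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sizes only; 4K assumes x86-64 base pages. */
#define SKETCH_PAGE_SIZE       (4UL * 1024)        /* stand-in for PAGE_SIZE */
#define SKETCH_LARGE_ZERO_SIZE (2UL * 1024 * 1024) /* 2M large zero page */

/*
 * Hypothetical helper: number of segments (bvecs) needed to cover `len`
 * bytes of zeroes when one segment can reference at most `seg_size`
 * bytes of the shared zero page. This is a plain ceiling division.
 */
static size_t zero_segments_needed(size_t len, size_t seg_size)
{
	assert(seg_size > 0);
	return len / seg_size + (len % seg_size != 0);
}
```

Under these assumptions, zeroing 1M takes 256 segments backed by a 4K zero
page but a single segment backed by a 2M zero page, and zeroing 8M takes
2048 segments versus 4.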
Some examples of places in the kernel where this could be useful:
- __blkdev_issue_zero_pages()
- iomap_dio_zero()
- vmalloc.c:zero_iter()
- rxperf_process_call()
- fscrypt_zeroout_range_inline_crypt()
- bch2_checksum_update()
...
I have converted __blkdev_issue_zero_pages() and iomap_dio_zero() as
examples as part of this series.
While there are other options such as huge_zero_page, they can fail
depending on system conditions, requiring a fallback to ZERO_PAGE[3].
LARGE_ZERO_PAGE is added behind a config option so that systems that are
constrained by memory are not forced to use it.
Looking forward to some feedback.
[1] https://lore.kernel.org/linux-xfs/20231027051847.GA7885@lst.de/
[2] https://lore.kernel.org/linux-xfs/ZitIK5OnR7ZNY0IG@infradead.org/
Pankaj Raghav (3):
  mm: add large zero page for efficient zeroing of larger segments
  block: use LARGE_ZERO_PAGE in __blkdev_issue_zero_pages()
  iomap: use LARGE_ZERO_PAGE in iomap_dio_zero()

 arch/Kconfig                   |  8 ++++++++
 arch/x86/include/asm/pgtable.h | 20 +++++++++++++++++++-
 arch/x86/kernel/head_64.S      |  9 ++++++++-
 block/blk-lib.c                |  4 ++--
 fs/iomap/direct-io.c           | 31 +++++++++----------------------
 5 files changed, 46 insertions(+), 26 deletions(-)
base-commit: 9e619cd4fefd19cdce16e169d5827bc64ae01aa1
--
2.47.2