Message-Id: <20241209083618.2889145-1-chenridong@huaweicloud.com>
Date: Mon,  9 Dec 2024 08:36:17 +0000
From: Chen Ridong <chenridong@...weicloud.com>
To: akpm@...ux-foundation.org,
	mhocko@...e.com,
	hannes@...xchg.org,
	yosryahmed@...gle.com,
	yuzhao@...gle.com,
	david@...hat.com,
	willy@...radead.org,
	ryan.roberts@....com,
	baohua@...nel.org,
	21cnbao@...il.com,
	wangkefeng.wang@...wei.com
Cc: linux-mm@...ck.org,
	linux-kernel@...r.kernel.org,
	chenridong@...wei.com,
	wangweiyang2@...wei.com,
	xieym_ict@...mail.com
Subject: [PATCH v4 0/1] mm: vmscan: retry folios written back while isolated for traditional LRU

From: Chen Ridong <chenridong@...wei.com>

Commit 359a5e1416ca ("mm: multi-gen LRU: retry folios written back
while isolated") fixed this issue only for MGLRU.  The same issue also
exists in the traditional active/inactive LRU.  Fix it there in the
same way.

What is fixed:
The page reclaim isolates a batch of folios from the tail of one of the
LRU lists and works on those folios one by one.  For a suitable
swap-backed folio, if the swap device is async, it queues that folio for
writeback.  After the page reclaim finishes an entire batch, it puts back
the folios it queued for writeback to the head of the original LRU list.

In the meantime, the page writeback flushes the queued folios also by
batches.  Its batching logic is independent from that of the page reclaim.
For each of the folios it writes back, the page writeback calls
folio_rotate_reclaimable() which tries to rotate a folio to the tail.

folio_rotate_reclaimable() only works for a folio after the page reclaim
has put it back.  If an async swap device is fast enough, the page
writeback can finish with that folio while the page reclaim is still
working on the rest of the batch containing it.  In this case, that folio
will remain at the head and the page reclaim will not retry it before
reaching there.
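
The race described above can be illustrated with a small user-space toy
model.  This is only a sketch under stated assumptions, not the kernel
code and not the patch itself: the Folio class, the on_lru and writeback
fields, and the reclaim_batch() and simulate() helpers are all
hypothetical names, the LRU is modelled as a deque with the head on the
left and the tail on the right, and "retry" is simplified to "free any
isolated folio whose writeback has already finished".

```python
from collections import deque


class Folio:
    """Hypothetical stand-in for a swap-backed folio."""

    def __init__(self, name):
        self.name = name
        self.on_lru = True       # False while isolated by reclaim
        self.writeback = False   # True while writeback is in flight


def rotate_reclaimable(lru, folio):
    """Toy analogue of folio_rotate_reclaimable(): move to the tail,
    but only if the folio is currently on the LRU list."""
    if not folio.on_lru:
        return False             # still isolated: rotation is a no-op
    lru.remove(folio)
    lru.append(folio)            # right end of the deque is the tail
    return True


def end_writeback(lru, folio):
    """Writeback completion: clear the flag, then try to rotate."""
    folio.writeback = False
    return rotate_reclaimable(lru, folio)


def reclaim_batch(lru, batch_size, fast_device, retry):
    """Isolate a batch from the tail, queue writeback, then put back."""
    batch = [lru.pop() for _ in range(min(batch_size, len(lru)))]
    for folio in batch:
        folio.on_lru = False

    for folio in batch:
        folio.writeback = True           # queued for async writeback
        if fast_device:
            # Writeback finishes while reclaim still holds the batch.
            end_writeback(lru, folio)

    freed = []
    putback = []
    for folio in batch:
        if retry and not folio.writeback:
            freed.append(folio)          # simplified "retry": reclaim now
        else:
            putback.append(folio)

    for folio in putback:
        folio.on_lru = True
        lru.appendleft(folio)            # put back at the head
    return freed


def simulate(retry, fast_device=True):
    lru = deque(Folio(n) for n in "abcdefgh")   # 'a' = head, 'h' = tail
    freed = reclaim_batch(lru, 4, fast_device, retry)
    return [f.name for f in freed], [f.name for f in lru]


if __name__ == "__main__":
    print("without retry:", simulate(retry=False))
    print("with retry:   ", simulate(retry=True))
```

In this model, without the retry the four folios that were written back
while isolated end up clean but at the head of the list, so reclaim has
to walk the whole list before it sees them again.  With the retry they
are reclaimed right away.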

---
v4:
 - combine patch 1 and patch 2 from v3 into a single patch.
 - refine the commit message.
 - fix build errors reported by kernel test robot <lkp@...el.com>.
v3:
 - fix this issue in the same way as multi-gen LRU.

v2:
 - detect folios whose writeback has completed and move them to the tail
    of the LRU, as suggested by Barry Song.
[2] https://lore.kernel.org/linux-kernel/CAGsJ_4zqL8ZHNRZ44o_CC69kE7DBVXvbZfvmQxMGiFqRxqHQdA@mail.gmail.com/

v1:
[1] https://lore.kernel.org/linux-kernel/20241010081802.290893-1-chenridong@huaweicloud.com/

Chen Ridong (1):
  mm: vmscan: retry folios written back while isolated for traditional
    LRU

 include/linux/mmzone.h |   3 +-
 mm/vmscan.c            | 108 +++++++++++++++++++++++++++++------------
 2 files changed, 77 insertions(+), 34 deletions(-)

-- 
2.34.1

