Message-ID: <ZqexmNIc00Xlwy2c@casper.infradead.org>
Date: Mon, 29 Jul 2024 16:13:28 +0100
From: Matthew Wilcox <willy@...radead.org>
To: Barry Song <21cnbao@...il.com>
Cc: akpm@...ux-foundation.org, linux-mm@...ck.org, ying.huang@...el.com,
	baolin.wang@...ux.alibaba.com, chrisl@...nel.org, david@...hat.com,
	hannes@...xchg.org, hughd@...gle.com, kaleshsingh@...gle.com,
	kasong@...cent.com, linux-kernel@...r.kernel.org, mhocko@...e.com,
	minchan@...nel.org, nphamcs@...il.com, ryan.roberts@....com,
	senozhatsky@...omium.org, shakeel.butt@...ux.dev,
	shy828301@...il.com, surenb@...gle.com, v-songbaohua@...o.com,
	xiang@...nel.org, yosryahmed@...gle.com,
	Chuanhua Han <hanchuanhua@...o.com>
Subject: Re: [PATCH v5 3/4] mm: support large folios swapin as a whole for
 zRAM-like swapfile

On Tue, Jul 30, 2024 at 01:11:31AM +1200, Barry Song wrote:
> for this zRAM case, it is a new allocated large folio, only
> while all conditions are met, we will allocate and map
> the whole folio. you can check can_swapin_thp() and
> thp_swap_suitable_orders().

YOU ARE DOING THIS WRONGLY!

All of you anonymous memory people are utterly fixated on TLBs AND THIS
IS WRONG.  Yes, TLB performance is important, particularly with crappy
ARM designs, which I know a lot of you are paid to work on.  But you
seem to think this is the only consideration, and you're making bad
design choices as a result.  It's overly complicated, and you're leaving
performance on the table.

Look back at the results Ryan showed in the early days of working on
large anonymous folios.  Half of the performance win on his system came
from using larger TLB entries.  But the other half came from _reduced software
overhead_.  The LRU lock is a huge problem, and using large folios cuts
the length of the LRU list, hence LRU lock hold time.
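
To put rough numbers on the list-length point, here is a back-of-the-envelope
sketch, not kernel code.  It assumes 4KiB base pages, that all of the memory
sits on a single LRU list, and that every folio has the same order;
lru_entries() is a made-up helper for illustration only.

```python
PAGE_SIZE = 4096  # assumed base page size in bytes


def lru_entries(mem_bytes: int, order: int, page_size: int = PAGE_SIZE) -> int:
    """Folios (LRU list entries) needed to cover mem_bytes when every
    folio has the given order, i.e. 2**order base pages."""
    folio_bytes = page_size << order
    return -(-mem_bytes // folio_bytes)  # ceiling division


if __name__ == "__main__":
    mem = 1 << 30  # 1GiB of anonymous memory, an illustrative figure
    for order in (0, 2, 4):
        print(f"order-{order}: {lru_entries(mem, order)} entries")
```

Under those assumptions an order-4 (64KiB) folio gives a list 16 times
shorter than order-0 pages, whether or not the folio ends up mapped with
a single TLB entry.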

Your _own_ data on how hard it is to get hold of a large folio due to
fragmentation should be enough to convince you that the more large folios
in the system, the better the whole system runs.  We should not decline to
allocate large folios just because they can't be mapped with a single TLB entry!

